Getting TensorFlow Extended (TFX) 1.14.0 to work with Apple Silicon natively
Dec 18, 2023 · 582 words · 3 minute read
TFX is not compatible with Apple Silicon yet, though there are a few pull requests in flight to make this happen. If you are willing to build your own wheels against the 1.14.0 tags with the relevant pull requests applied, it is possible to run TFX natively, though I have no idea how to run the appropriate test suites to verify full compatibility. See this post for how to get TFX master working with TF 2.15.0.
You will need to build patched 1.14.0 versions of:

- `google/ml-metadata` [Pull Request] [Branch w/PR applied to 1.14.0 tag]
- `tensorflow/tfx-bsl` [Pull Request] [Branch w/PR applied to 1.14.0 tag]
- `tensorflow/data-validation` [Pull Request] [Branch w/PR applied to 1.14.0 tag]
Installing them into a venv or conda/mamba environment should then allow you to install `tfx==1.14.0`. All the usual caveats of installing hand-rolled versions of libraries apply. Here be dragons!
Building and installing
Prerequisites:

- Xcode >= 15 (the Xcode Command Line Tools at 15 are not enough, you really want full Xcode)
- CMake and Bazelisk (the Homebrew installs work)
- Python 3.9 or 3.10
I’ll use venv, but conda-like environments should work too. I’ve tried this with an M1 Pro on Sonoma 14.2 and Xcode 15.0, with Python 3.9 and 3.10 under micromamba/venv, but YMMV.
Steps:
- Create and activate your venv:

  ```shell
  python -m venv .venv
  . .venv/bin/activate
  pip install -U pip wheel
  ```

- Pin the version of Bazel to `5.3.2` (from `ml-metadata`'s version):

  ```shell
  export USE_BAZEL_VERSION=5.3.2
  ```

- Clone and build the required projects:
  - `google/ml-metadata`:

    ```shell
    git clone https://github.com/tangm/ml-metadata.git
    cd ml-metadata
    git checkout v1.14.0-m1fix
    python setup.py bdist_wheel
    pip install dist/ml_metadata-1.14.0-cp310-cp310-macosx_11_0_universal2.whl
    ```

  - `tensorflow/tfx-bsl`:

    ```shell
    git clone https://github.com/tangm/tfx-bsl.git
    cd tfx-bsl
    git checkout r1.14.0-48-Allow-compilation-on-m1-macs
    pip install numpy  # (per `tfx-bsl` source building instructions)
    python setup.py bdist_wheel
    pip install dist/tfx_bsl-1.14.0-cp310-cp310-macosx_11_0_universal2.whl jsonschema==4.17.3
    ```

    - If you don't specify the `jsonschema` dependency, you will probably see an error like:

      ```
      ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
      ml-metadata 1.14.0 requires attrs<22,>=20.3, but you have attrs 23.1.0 which is incompatible.
      ```

    - This is because `tfx-bsl`'s constraint is `apache-beam[gcp] >= 2.47.0, <3`, and the latest matching version of `apache-beam` is `2.52.0`, which brings in an incompatible version of `attrs` via its `jsonschema` dependency. We'll run into this again later, but for now, compatible versions can be found in this issue for the official tfx 1.14.0 docker image.
  - `tensorflow/data-validation`:

    ```shell
    git clone https://github.com/tangm/data-validation.git
    cd data-validation
    git checkout r1.14.0-205-allow-apple-silicon
    python setup.py bdist_wheel
    pip install dist/tensorflow_data_validation-1.14.0-cp310-cp310-macosx_11_0_universal2.whl
    ```
- And finally, install `tfx`!

  ```shell
  pip install tfx==1.14.0 jsonschema==4.17.3
  ```

  - Like before, the `jsonschema` pin is due to transitive dependencies.
- Due to this bug, pre-emptively run:

  ```shell
  pip install -U google-cloud-aiplatform "shapely<2"
  ```
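As an aside, the `attrs` conflict above is easy to see mechanically. Below is a rough sketch of the version check behind pip's error message; it is not pip's real resolver (the helpers `_parse` and `satisfies` are made up here, and only plain numeric versions are handled), but it shows why `attrs` 23.1.0 fails `ml-metadata`'s `attrs<22,>=20.3` constraint:

```python
# Toy sketch of the version check behind pip's conflict message.
# Not pip's actual algorithm -- just enough to see why attrs 23.1.0
# fails the `attrs<22,>=20.3` constraint from ml-metadata 1.14.0.
import operator

OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq,
       "<": operator.lt, ">": operator.gt}

def _parse(version):
    # "23.1.0" -> (23, 1, 0); good enough for plain numeric versions
    return tuple(int(part) for part in version.split("."))

def satisfies(version, spec):
    """True if `version` meets every comma-separated constraint in `spec`."""
    for clause in spec.split(","):
        clause = clause.strip()
        for op_text in ("<=", ">=", "==", "<", ">"):  # two-char ops first
            if clause.startswith(op_text):
                bound = _parse(clause[len(op_text):])
                if not OPS[op_text](_parse(version), bound):
                    return False
                break
    return True

print(satisfies("23.1.0", "<22,>=20.3"))  # False: attrs pulled in by apache-beam
print(satisfies("21.4.0", "<22,>=20.3"))  # True: a version ml-metadata accepts
```

Pinning `jsonschema==4.17.3` keeps `apache-beam` from pulling in the newer, incompatible `attrs`.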
Caveats
I ran into something like:

```
Library not loaded: @rpath/libc++.1.dylib
```

originating from `pyfarmhash`. Rebuilding it using:

```shell
pip install pyfarmhash --force-reinstall --no-cache-dir
```

seemed to work, though I'm sure more fundamental solutions exist.
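To dig into where an `@rpath` reference comes from, `otool -L` on the offending extension module lists its linked libraries. Here is a small sketch for pulling the `@rpath`-relative entries out of that output; the sample text is illustrative, not captured from a real `pyfarmhash` build:

```python
# Sketch: extract @rpath-relative dependencies from `otool -L` output.
# Run `otool -L path/to/module.so` on a real build to get actual output;
# SAMPLE below is a made-up example of the format.
def rpath_deps(otool_output):
    deps = []
    for line in otool_output.splitlines()[1:]:  # first line is the binary path
        line = line.strip()
        if line.startswith("@rpath/"):
            deps.append(line.split(" (compatibility")[0])
    return deps

SAMPLE = """\
farmhash.cpython-310-darwin.so:
\t@rpath/libc++.1.dylib (compatibility version 1.0.0, current version 1600.0.0)
\t/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1345.0.0)
"""
print(rpath_deps(SAMPLE))  # ['@rpath/libc++.1.dylib']
```

If the list contains `@rpath/libc++.1.dylib`, the wheel was built expecting a runtime search path that your environment doesn't provide, which is what the forced rebuild fixes.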
Sanity Testing
```shell
python -c "from tfx import version ; print('TFX version: {}'.format(version.__version__))"
```

Should show `1.14.0`.
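You can also read the version from package metadata without importing TFX at all, which is handy when the import itself is what crashes. A small sketch using the standard library's `importlib.metadata` (the helper name `installed_version` is mine):

```python
# Sketch: check an installed package's version via its metadata instead of
# importing it (useful when the import itself might fail).
from importlib.metadata import version, PackageNotFoundError

def installed_version(package):
    """Return the installed version string, or None if not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None

print(installed_version("tfx"))  # expect '1.14.0' after the steps above
print(installed_version("pip"))  # pip itself is always present in a venv
```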
Using the penguin template:
```shell
export PIPELINE_NAME=sanity_check
export PROJECT_DIR=$PWD/$PIPELINE_NAME
tfx template copy \
  --pipeline_name="${PIPELINE_NAME}" \
  --destination_path="${PROJECT_DIR}" \
  --model=penguin
cd sanity_check
tfx pipeline create --engine=local --pipeline_path=local_runner.py
tfx run create --engine=local --pipeline_name="${PIPELINE_NAME}"
```
If you update the `pipeline/pipeline.py` file to uncomment other components, as in the tutorials, remember to update the pipeline before running it again:

```shell
tfx pipeline update --engine=local --pipeline_path=local_runner.py
tfx run create --engine=local --pipeline_name="${PIPELINE_NAME}"
```
Thanks
Big props to @nicholasjng, who did most of the work on the ml-metadata PR, and to @IzakMaraisTAL for helpfully documenting the dependencies in the docker issue. Please vote and comment on the PRs if you want this to actually become a thing and avoid hand-rolling stuff.