Getting TensorFlow Extended (TFX) 1.14.0 to work with Apple Silicon natively
Dec 18, 2023 · 582 words · 3 minute read
TFX is not compatible with Apple Silicon yet, though there are a few pull requests in flight to make this happen. If you are willing to build your own wheels against the `1.14.0` tags with the relevant pull requests applied, it is possible to run TFX natively, though I have no idea how to run the appropriate test suites to verify full compatibility. See this post for how to get TFX master working with TF 2.15.0.
You will need to build patched `1.14.0` versions of:

- `google/ml-metadata` [Pull Request] [Branch w/PR applied to 1.14.0 tag]
- `tensorflow/tfx-bsl` [Pull Request] [Branch w/PR applied to 1.14.0 tag]
- `tensorflow/data-validation` [Pull Request] [Branch w/PR applied to 1.14.0 tag]

Installing them into a `venv` or conda/mamba environment should then allow you to install `tfx==1.14.0`. All the usual caveats of installing hand-rolled versions of libraries apply. Here be dragons!
Building and installing
Pre-requisites:

- `Xcode` >= 15 (the Xcode Command Line Tools being at 15 is not enough, you really want `Xcode`)
- `Cmake` and `Bazelisk` (the homebrew installs work)
- Python 3.9 or 3.10
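
Optionally, a quick pre-flight check that the toolchain is visible and that your Python is actually running natively rather than under Rosetta (a minimal sketch; adjust the interpreter name to whatever you use):

```bash
# Pre-flight check: Xcode version, build tools on PATH, and a native arm64 Python.
xcodebuild -version                 # should report Xcode 15.x
command -v cmake bazelisk           # both should resolve (e.g. to the homebrew installs)
python3 -c "import platform; print(platform.machine(), platform.python_version())"
# expect: arm64 and 3.9.x or 3.10.x
```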
I’ll use `venv`, but conda-like environments should work too. I’ve tried this with an M1 Pro on Sonoma 14.2 and Xcode 15.0, with Python 3.9 and 3.10 under micromamba/venv, but YMMV.
Steps:
- Create and activate your `venv`:

  ```bash
  python -m venv .venv
  . .venv/bin/activate
  pip install -U pip wheel
  ```
- Pin the version of `Bazel` to `5.3.2` (from `ml-metadata`’s version):

  ```bash
  export USE_BAZEL_VERSION=5.3.2
  ```
- Clone and build the required projects:
  - `google/ml-metadata`

    ```bash
    git clone https://github.com/tangm/ml-metadata.git
    cd ml-metadata
    git checkout v1.14.0-m1fix
    python setup.py bdist_wheel
    pip install dist/ml_metadata-1.14.0-cp310-cp310-macosx_11_0_universal2.whl
    cd ..  # back to the parent directory before cloning the next project
    ```
  - `tensorflow/tfx-bsl`

    ```bash
    git clone https://github.com/tangm/tfx-bsl.git
    cd tfx-bsl
    git checkout r1.14.0-48-Allow-compilation-on-m1-macs
    pip install numpy  # (per `tfx-bsl` source building instructions)
    python setup.py bdist_wheel
    pip install dist/tfx_bsl-1.14.0-cp310-cp310-macosx_11_0_universal2.whl jsonschema==4.17.3
    cd ..  # back to the parent directory before cloning the next project
    ```
    - If you don’t specify the `jsonschema` dependency, you will probably see an error like:

      ```
      ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
      ml-metadata 1.14.0 requires attrs<22,>=20.3, but you have attrs 23.1.0 which is incompatible.
      ```
    - This is because the constraint from `tfx-bsl` is `apache-beam[gcp]>=2.47.0,<3`, and the latest matching version of `apache-beam` is `2.52.0`, which brings in an incompatible version of `attrs` via its `jsonschema` dependency. We’ll run into this again later, but for now, compatible versions can be found in this issue for the official tfx 1.14.0 docker image; see the sketch after these steps for a quick way to check that the pins took.
  - `tensorflow/data-validation`

    ```bash
    git clone https://github.com/tangm/data-validation.git
    cd data-validation
    git checkout r1.14.0-205-allow-apple-silicon
    python setup.py bdist_wheel
    pip install dist/tensorflow_data_validation-1.14.0-cp310-cp310-macosx_11_0_universal2.whl
    ```
- And finally, install `tfx`!

  ```bash
  pip install tfx==1.14.0 jsonschema==4.17.3
  ```
  - Like before, the `jsonschema` pin is due to transitive dependencies.
- Due to this bug, pre-emptively run:

  ```bash
  pip install -U google-cloud-aiplatform "shapely<2"
  ```
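
Once everything is installed, it’s worth letting pip confirm that the various pins actually line up (a minimal sketch; `pip check` only inspects declared dependency metadata, so a clean report is necessary but not sufficient):

```bash
# Surface any dependency conflicts pip still knows about after the manual pins.
pip check

# Spot-check the packages that caused trouble above.
pip show jsonschema attrs apache-beam | grep -E "^(Name|Version):"
```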
Caveats
I ran into something like:

```
Library not loaded: @rpath/libc++.1.dylib
```

originating from `pyfarmhash`. Rebuilding it using:

```bash
pip install pyfarmhash --force-reinstall --no-cache-dir
```

seemed to work, though I’m sure more fundamental solutions exist.
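
A quick way to confirm the rebuilt wheel actually loads (assuming `pyfarmhash` exposes its bindings as the `farmhash` module):

```bash
# If the libc++ rpath problem is fixed, this import should succeed without errors.
python -c "import farmhash; print('pyfarmhash loaded')"
```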
Sanity Testing
python -c "from tfx import version ; print('TFX version: {}'.format(version.__version__))"
Should show 1.14.0
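
It can also be worth confirming that the hand-built wheels are the ones actually installed; a small sketch using `importlib.metadata` (which reads the installed distribution metadata):

```bash
# Each of the patched packages should also report 1.14.0.
python -c "
from importlib.metadata import version
for pkg in ('ml-metadata', 'tfx-bsl', 'tensorflow-data-validation'):
    print(pkg, version(pkg))
"
```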
Using the `penguin` template:

```bash
export PIPELINE_NAME=sanity_check
export PROJECT_DIR=$PWD/$PIPELINE_NAME

tfx template copy \
  --pipeline_name="${PIPELINE_NAME}" \
  --destination_path="${PROJECT_DIR}" \
  --model=penguin

cd sanity_check
tfx pipeline create --engine=local --pipeline_path=local_runner.py
tfx run create --engine=local --pipeline_name="${PIPELINE_NAME}"
```
If you update the `pipeline/pipeline.py` file to uncomment other components like in the tutorials, remember to update the pipeline before running it again:

```bash
tfx pipeline update --engine=local --pipeline_path=local_runner.py
tfx run create --engine=local --pipeline_name="${PIPELINE_NAME}"
```
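
To check whether the runs actually completed, the CLI can list them (assuming the local engine as above; the exact output columns vary by TFX version):

```bash
# List recorded runs for the pipeline and inspect their status.
tfx run list --engine=local --pipeline_name="${PIPELINE_NAME}"
```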
Thanks
Big props to @nicholasjng, who did most of the work on the `ml-metadata` PR, and to @IzakMaraisTAL for helpfully documenting the dependencies in the docker issue. Please vote and comment on the PRs if you want this to actually be a thing and avoid hand-rolling stuff.