Tensorflow wasm


https://blog.tensorflow.org/2020/03/introducing-webassembly-backend-for-tensorflow-js.html


March 11, 2020 — Posted by Daniel Smilkov, Nikhil Thorat, and Ann Yuan, Software Engineers at Google


We’re happy to announce that TensorFlow.js now provides a WebAssembly (WASM) backend for both the browser and Node.js! This backend is an alternative to the WebGL backend, bringing fast CPU execution with minimal code changes. This backend helps improve performance on a broader set of devices, especially lower-end mobile devices that lack WebGL support or have a slow GPU. It uses the XNNPACK library to accelerate the operations.

Installation

There are two ways to use the new WASM backend:
  1. With NPM
    The library expects the WASM binary to be located relative to the main JS file. If you’re using a bundler such as parcel or webpack, you may need to manually indicate the location of the WASM binary with our setWasmPaths helper (see the sketch after this list).
    See the “Using bundlers” section in our README for more information.
  2. With script tags
    NOTE: TensorFlow.js defines a priority for each backend and will automatically choose the best supported backend for a given environment. Today, WebGL has the highest priority, followed by WASM, then the vanilla JS backend. To always use the WASM backend, we need to explicitly call tf.setBackend('wasm'), as shown in the sketch after this list.
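A rough sketch of the NPM setup (the import paths and calls mirror the package README further down in this page; the path prefix passed to setWasmPaths is a placeholder for wherever you host the binary):

import * as tf from '@tensorflow/tfjs';
import {setWasmPaths} from '@tensorflow/tfjs-backend-wasm'; // importing registers the WASM backend

// Only needed if your bundler does not serve the .wasm binary next to the JS bundle.
setWasmPaths('/static/tfjs-wasm/');

tf.setBackend('wasm').then(() => {
  // Model loading and inference go here once the WASM module is ready.
});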

Demo

Check out the face detection demo (using the MediaPipe BlazeFace model) that runs on the WASM backend. For more details about the model, see this blog post.
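For reference, a minimal sketch of running BlazeFace on the WASM backend; the @tensorflow-models/blazeface calls are the model's published API, while the video element id is an assumption for illustration:

import * as tf from '@tensorflow/tfjs';
import '@tensorflow/tfjs-backend-wasm';            // registers the WASM backend
import * as blazeface from '@tensorflow-models/blazeface';

async function run() {
  await tf.setBackend('wasm');                     // force WASM instead of WebGL
  const model = await blazeface.load();            // load the BlazeFace model
  const video = document.getElementById('webcam'); // assumed <video> element id
  const faces = await model.estimateFaces(video);  // bounding boxes + landmarks
  console.log(faces);
}
run();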

Why WASM?

WASM is a cross-browser, portable assembly and binary format for the web that brings near-native code execution speed to the web. It was introduced in 2015 as a new web-based binary format, providing programs written in C, C++, or Rust a compilation target for running on the web. WASM has been supported by Chrome, Safari, Firefox, and Edge since 2017, and is supported by 90% of devices worldwide.

Performance

Versus JavaScript: WASM is generally much faster than JavaScript for the numeric workloads common in machine learning tasks. Additionally, WASM can be natively decoded up to 20x faster than JavaScript can be parsed. JavaScript is dynamically typed and garbage collected, which can cause significant non-deterministic slowdowns at runtime. Additionally, modern JavaScript libraries (such as TensorFlow.js) use compilation tools like TypeScript and ES6 transpilers that generate ES5 code (for wide browser support) that is slower to execute than vanilla ES6 JavaScript.

Versus WebGL: For most models, the WebGL backend will still outperform the WASM backend; however, WASM can be faster for ultra-lite models (less than 3MB and 60M multiply-adds). In this scenario, the benefits of GPU parallelization are outweighed by the fixed overhead costs of executing WebGL shaders. Below we provide guidelines for finding this line. However, there is a WASM extension proposal to add SIMD instructions, allowing multiple floating point operations to be vectorized and executed in parallel. Preliminary tests show that enabling these extensions brings a 2-3x speedup over WASM today. Keep an eye out for this to land in browsers! It will automatically be turned on for TensorFlow.js.

Portability and Stability

When it comes to machine learning, numerical precision matters. WASM natively supports floating point arithmetic, whereas the WebGL backend requires the OES_texture_float extension. Not all devices support this extension, which means a GPU-accelerated TensorFlow.js isn’t supported on some devices (e.g. older mobile devices where WASM is supported).

Moreover, GPU drivers can be hardware-specific and different devices can have precision problems. On iOS, 32 bit floats aren’t supported on the GPU so we fall back to 16 bit floats, causing precision problems. In WASM, computation will always happen in 32 bit floats and thus have precision parity across all devices.

When should I use WASM?

In general, WASM is a good choice when models are smaller, if you care about wide device support, or if your project is sensitive to numerical stability. WASM, however, doesn’t have op parity with our WebGL backend. If you are using the WASM backend and need an op to be implemented, feel free to file an issue on GitHub. To address the needs of production use-cases, we prioritized inference over training support. For training models in the browser, we recommend using the WebGL backend.

In Node.js, the WASM backend is a great solution for devices that don’t support the TensorFlow binary, or when you don’t want to build it from source.
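For example, a minimal Node.js sketch might look like this, assuming @tensorflow/tfjs and @tensorflow/tfjs-backend-wasm are installed (no native TensorFlow binary required):

const tf = require('@tensorflow/tfjs');
require('@tensorflow/tfjs-backend-wasm'); // registers the WASM backend in Node

tf.setBackend('wasm').then(() => {
  console.log('Active backend:', tf.getBackend()); // -> 'wasm'
  const x = tf.tensor2d([[1, 2], [3, 4]]);
  x.square().print();                              // runs on the WASM backend
});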

The table below shows inference times (in milliseconds) in Chrome on a 2018 MacBook Pro (Intel i7 2.2GHz, Radeon 555X) for several of our officially supported models across the WebGL, WASM, and plain JS (CPU) backends.
We observe the WASM backend to be 10-30x faster than the plain JS (CPU) backend across our models. Comparing WASM to WebGL, there are two main takeaways:
  1. WASM is on par with, or faster than, WebGL for ultra-lite models like MediaPipe’s BlazeFace and FaceMesh.
  2. WASM is 2-4X slower than WebGL for medium-sized edge models like MobileNet, BodyPix and PoseNet.

Looking ahead

We believe WASM will be an increasingly preferred backend. In the last year we have seen a wave of production-quality ultra-light models designed for edge devices (e.g. MediaPipe’s BlazeFace and FaceMesh), for which the WASM backend is ideally suited.

In addition, new extensions such as SIMD and threads are actively being developed, which will enable further acceleration in the future.

SIMD / QFMA

There is a WASM extension proposal to add SIMD instructions. Today, Chrome has partial support for SIMD under an experimental flag, the Firefox and Edge implementations are in development, while Safari hasn’t given any public signal. SIMD is hugely promising. Benchmarks with SIMD-WASM on popular ML models show a 2-3X speedup over non-SIMD WASM.

In addition to the original SIMD proposal, the LLVM WASM backend recently got support for experimental QFMA SIMD instructions that should further improve the performance of kernels. Benchmarks on popular ML models show QFMA SIMD giving an additional 26-50% speedup over regular SIMD.

The TF.js WASM backend will take advantage of SIMD through the XNNPACK library, which includes optimized micro-kernels for WASM SIMD. When SIMD lands, this will be invisible to the TensorFlow.js user.

Multithreading

The WASM spec recently got a threads and atomics proposal with the goal of speeding up multi-threaded applications. The proposal is in an early stage, meant to seed a future W3C Working Group. Notably, Chrome 74+ has support for WASM threads enabled by default.

When the threading proposal lands, we will be ready to take advantage of threads through the XNNPACK library with no changes to TensorFlow.js user code.

More information

  • If you are interested in learning more, you can read our WebAssembly guide.
  • Learn more about WebAssembly by checking out this collection of resources by the Mozilla Developer Network.
  • We’d appreciate your feedback and contributions via issues and PRs on GitHub!
Source: https://blog.tensorflow.org/2020/03/introducing-webassembly-backend-for-tensorflow-js.html

Why Even WASM?

WebAssembly (WASM) is taking the web world by storm. It increases speed and efficiency for complex web operations, and it’s a perfect fit for advanced workloads like TensorFlow.js.

Before TensorFlow officially supported WASM, we had to write our own WASM code to read models directly in the browser, which gave us jaw-dropping results but took profound effort and time.

If you’re looking to bring AI/ML to your website or mobile app, let’s chat.

As of this writing, TensorFlow.js has a semi-functional WASM backend for running models in the browser, and you can add it as an option to your TensorFlow websites. Let’s take a tour of what you’ll have to do to get this alpha feature into your Create React App applications.

For machines with little or no GPU, the WASM backend provides a significant speed boost over traditional JavaScript, and on very lite models the WASM backend is comparable to WebGL, with better numerical stability.

Let’s learn how to add it to a Create React App website!

As of this writing, there’s currently no clean way (that I’ve found) to connect WASM with Create React App. I can do it with straight-up webpack, but CRA has a complex pipeline that causes lots of issues. I’ve tried several rewiring approaches and a couple of wild tricks. I can say I got close several times, but according to Create React App’s GitHub, it’s still not been done.

For this reason, I’ll be skipping the complexity entirely, and going with a non-ejected, simple, albeit slightly hacky solution that works.

I assume your project already has TensorFlow.js, but just to be sure, I’ve included it in the install directions for TFJS and TFJS Backend WASM below.

# NPM
npm i @tensorflow/tfjs @tensorflow/tfjs-backend-wasm --save
# Yarn
yarn add @tensorflow/tfjs @tensorflow/tfjs-backend-wasm

Here’s the dirty part:

We need the WASM file from the node package, but of all the dirty solutions for Create React App, this is the cleanest I could come up with. We’re going to copy the file to the public folder. To do this, we’ll add a new script called “wasm” to our package.json file, sketched below.
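Here is a sketch of what that package.json change might look like. The .wasm location inside node_modules (dist/tfjs-backend-wasm.wasm) and the public destination are assumptions based on the description below, and the copy command is the Windows form, as noted further down:

{
  "scripts": {
    "wasm": "copy node_modules\\@tensorflow\\tfjs-backend-wasm\\dist\\tfjs-backend-wasm.wasm public\\",
    "start": "npm run wasm && react-scripts start",
    "build": "npm run wasm && react-scripts build"
  }
}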

This copies the .wasm file from the node package to the public folder.

As you can see, we place this before the start and build scripts so we always have the latest file from our local NPM module. Note that the copy command shown is the Windows version; modify it to use your platform’s copy command (e.g. cp on macOS/Linux) depending on your machine’s particulars.

Now we can tell our TensorFlow.js app to use WASM as the backend in a few lines. We simply set the wasm path to our public .wasm file.
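A sketch of that wiring, assuming the binary was copied into the public folder root as above (PUBLIC_URL is the Create React App convention; adjust the prefix if you serve from a subpath):

import * as tf from '@tensorflow/tfjs';
import {setWasmPaths} from '@tensorflow/tfjs-backend-wasm'; // importing registers the backend

// Point TF.js at the binary we copied into /public before initializing the backend.
setWasmPaths(`${process.env.PUBLIC_URL}/`);

tf.setBackend('wasm').then(() => {
  // Safe to load and run models here.
});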

Source: https://shift.infinite.red/adding-tensorflow-js-wasm-backend-in-create-react-app-f57f5baab736

Usage

This package adds a WebAssembly backend to TensorFlow.js. It currently supports the following models from our models repo:

  • BlazeFace
  • BodyPix
  • CocoSSD
  • Face landmarks detection
  • HandPose
  • KNN classifier
  • MobileNet
  • PoseDetection
  • Q&A
  • AutoML Image classification
  • AutoML Object detection

Importing the backend

Via NPM

// Import @tensorflow/tfjs or @tensorflow/tfjs-core
import * as tf from '@tensorflow/tfjs';
// Adds the WASM backend to the global backend registry.
import '@tensorflow/tfjs-backend-wasm';
// Set the backend to WASM and wait for the module to be ready.
tf.setBackend('wasm').then(() => main());

Via a script tag

<!-- Import @tensorflow/tfjs or @tensorflow/tfjs-core -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
<!-- Adds the WASM backend to the global backend registry -->
<script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-wasm/dist/tf-backend-wasm.js"></script>
<script>
  tf.setBackend('wasm').then(() => main());
</script>

Setting up cross-origin isolation

Starting from Chrome 92 (to be released around July 2021), cross-origin isolation needs to be set up on your site in order to take advantage of the multi-threading support in the WASM backend. Without this, the backend will fall back to the WASM binary with SIMD-only support (or the vanilla version if SIMD is not enabled). Without multi-threading support, certain models might not achieve the best performance.

Here are the high-level steps to set up the cross-origin isolation. You can learn more about this topic here.

  1. Send the following two HTTP headers when your main document (e.g. index.html) that uses the WASM backend is served: Cross-Origin-Opener-Policy: same-origin and Cross-Origin-Embedder-Policy: require-corp. You may need to configure or ask your web host provider to enable these headers.

    • If you are loading the WASM backend from jsdelivr through the script tag, you are good to go. No more steps are needed.

      If you are loading the WASM backend from your own or other third-party servers, you need to make sure the script is served with either a CORS or a CORP header.

      • CORS header: Access-Control-Allow-Origin: *. In addition, you will also need to add the "crossorigin" attribute to your script tags.

      • CORP header:

        • If the resource is loaded from the same origin as your main site (e.g. main site: mysite.com/, script: mysite.com/script.js), set: Cross-Origin-Resource-Policy: same-origin

        • If the resource is loaded from the same site but cross origin (e.g. main site: mysite.com/, script: static.mysite.com:8080/script.js), set: Cross-Origin-Resource-Policy: same-site

        • If the resource is loaded from the cross origin(s) (e.g. main site: mysite.com/, script: mystatic.com/script.js), set: Cross-Origin-Resource-Policy: cross-origin

    If the steps above are correctly done, you can check the Network tab from the console and make sure the WASM binary is loaded.
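    As an illustration only (Express is not part of the TF.js docs, and the server setup here is an assumption), this is one way a Node server might send the two isolation headers from step 1:

    const express = require('express');
    const app = express();

    // Cross-origin isolation headers required for the multi-threaded WASM binary (Chrome 92+).
    app.use((req, res, next) => {
      res.setHeader('Cross-Origin-Opener-Policy', 'same-origin');
      res.setHeader('Cross-Origin-Embedder-Policy', 'require-corp');
      next();
    });

    app.use(express.static('public'));
    app.listen(8080);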

    Threads count

    By default, the backend will use the number of logical CPU cores as the threads count when creating the threadpool used by XNNPACK. You can use the setThreadsCount API to manually set it (it must be called before calling tf.setBackend('wasm')). The getThreadsCount API can be used to get the actual number of threads being used (it must be called after the WASM backend is initialized).

    Via NPM

    import * as tf from '@tensorflow/tfjs';
    import {getThreadsCount, setThreadsCount} from '@tensorflow/tfjs-backend-wasm';

    setThreadsCount(2);
    tf.setBackend('wasm').then(() => {
      console.log(getThreadsCount());
    });

    Via script tag

    tf.wasm.setThreadsCount(2);
    tf.setBackend('wasm').then(() => {
      console.log(tf.wasm.getThreadsCount());
    });

    Running MobileNet

    async function main() {
      let img = tf.browser.fromPixels(document.getElementById('img'))
          .resizeBilinear([224, 224])
          .expandDims(0)
          .toFloat();
      let model = await tf.loadGraphModel(
          'https://tfhub.dev/google/imagenet/mobilenet_v2_100_224/classification/2',
          {fromTFHub: true});
      const y = model.predict(img);
      y.print();
    }
    main();

    Our WASM backend builds on top of the XNNPACK library which provides high-efficiency floating-point neural network inference operators.

    Using bundlers

    The shipped library on NPM consists of 2 files:

    • the main js file (bundled js for browsers)
    • the WebAssembly binary (tfjs-backend-wasm.wasm)

    There is a proposal to add WASM support for ES6 modules. In the meantime, we have to manually read the wasm file. When the WASM backend is initialized, we make a fetch/readFile for the .wasm binary relative to the main js file. This means that bundlers such as Parcel and webpack need to be able to serve the .wasm file in production. See starter/parcel and starter/webpack for how to set up your favorite bundler.

    If you are serving the files from a different directory, call setWasmPaths with the location of that directory before you initialize the backend:

    import {setWasmPaths} from '@tensorflow/tfjs-backend-wasm';

    // setWasmPaths accepts a `prefixOrFileMap` argument which can be either a
    // string or an object. If passing in a string, this indicates the path to
    // the directory where your WASM binaries are located.
    setWasmPaths('www.yourdomain.com/');
    tf.setBackend('wasm').then(() => {...});

    If the WASM backend is imported through a script tag, setWasmPaths needs to be called on the tf.wasm object:

    tf.wasm.setWasmPaths('www.yourdomain.com/');

    Note that if you call setWasmPaths with a string, it will be used to load each binary (SIMD-enabled, threading-enabled, etc.). Alternatively, you can specify overrides for individual WASM binaries via a file map object. This is also helpful in case your binaries have been renamed.

    For example:

    import {setWasmPaths} from '@tensorflow/tfjs-backend-wasm';

    setWasmPaths({
      'tfjs-backend-wasm.wasm': 'www.yourdomain.com/renamed.wasm',
      'tfjs-backend-wasm-simd.wasm': 'www.yourdomain.com/renamed-simd.wasm',
      'tfjs-backend-wasm-threaded-simd.wasm': 'www.yourdomain.com/renamed-threaded-simd.wasm'
    });
    tf.setBackend('wasm').then(() => {...});

    If you are using a platform that does not support fetch directly, please set the optional usePlatformFetch argument to true:

    import {setWasmPaths} from '@tensorflow/tfjs-backend-wasm';

    const usePlatformFetch = true;
    setWasmPaths(yourCustomPathPrefix, usePlatformFetch);
    tf.setBackend('wasm').then(() => {...});

    JS Minification

    If your bundler is capable of minifying JS code, please turn off the option that transforms typeof foo == "undefined" into foo === void 0. For example, in terser, the option is called "typeofs" (located under the Compress options section). Without this feature turned off, the minified code will throw a "_scriptDir is not defined" error from web workers when running in browsers with SIMD+multi-threading support.
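    For example, with webpack and terser-webpack-plugin, the relevant switch might look like the sketch below; the option name is terser's, while the surrounding config is an assumption about your build setup:

    // webpack.config.js (excerpt)
    const TerserPlugin = require('terser-webpack-plugin');

    module.exports = {
      optimization: {
        minimizer: [
          new TerserPlugin({
            terserOptions: {
              compress: {
                typeofs: false, // keep `typeof` checks so _scriptDir stays defined in web workers
              },
            },
          }),
        ],
      },
    };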

    Use with Angular

    If you see a missing global type error when building your Angular app, make sure to add the missing type package to the "types" field in your tsconfig.app.json (or tsconfig.json), as sketched below:
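    A rough sketch of that tsconfig excerpt, assuming the missing global types come from the @types/emscripten package (match the entry to whatever package your error message mentions, and keep any entries your app already lists in the array):

    {
      "compilerOptions": {
        "types": ["emscripten"]
      }
    }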

    By default, the generated Angular app sets this field to an empty array, which will prevent the Angular compiler from automatically adding "global types" (defined in .d.ts files) to your app.

    Benchmarks

    The benchmarks below show inference times (ms) for two different edge-friendly models: MobileNet V2 (a medium-sized model) and Face Detector (a lite model). All the benchmarks were run in Chrome 79.0 using this benchmark page across our three backends: Plain JS (CPU), WebGL and WASM. Inference times are averaged across 200 runs.

    MobileNet V2

    MobileNet is a medium-sized model with 3.48M params and ~300M multiply-adds. For this model, the WASM backend is between ~3X-11.5X faster than the plain JS backend, and ~5.3-7.7X slower than the WebGL backend.

    MobileNet inference (ms)     | WASM  | WebGL | Plain JS | WASM + SIMD | WASM + SIMD + threads
    iPhone X                     | 147.1 | 20.3  | 941.3    | N/A         | N/A
    iPhone XS                    | 140   | 18.1  | 426.4    | N/A         | N/A
    Pixel 4                      | 182   | 76.4  | 1628     | 82          | N/A
    ThinkPad X1 Gen6 w/Linux     | 122.7 | 44.8  | 1489.4   | 34.6        | 12.4
    Desktop Windows              | 123.1 | 41.6  | 1117     | 37.2        | N/A
    Macbook Pro 15 2019          | 98.4  | 19.6  | 893.5    | 30.2        | 10.3
    Node v.14 on Macbook Pro     | 290   | N/A   | 1404.3   | 64.2        | N/A

    Face Detector

    Face detector is a lite model with 0.1M params and ~20M multiply-adds. For this model, the WASM backend is between ~8.2-19.8X faster than the plain JS backend and comparable to the WebGL backend (up to ~1.7X faster, or 2X slower, depending on the device).

    Face Detector inference (ms) | WASM | WebGL | Plain JS | WASM + SIMD | WASM + SIMD + threads
    iPhone X                     | 22.4 | 13.5  | 318      | N/A         | N/A
    iPhone XS                    | 21.4 | 10.5  | 176.9    | N/A         | N/A
    Pixel 4                      | 28   | 28    | 368      | 15.9        | N/A
    Desktop Linux                | 12.6 | 12.7  | 249.5    | 8.0         | 6.2
    Desktop Windows              | 16.2 | 7.1   | 270.9    | 7.5         | N/A
    Macbook Pro 15 2019          | 13.6 | 22.7  | 209.1    | 7.9         | 4.0

    When should I use the WASM backend?

    You should always try to use the WASM backend over the plain JS backend since it is strictly faster on all devices, across all model sizes. Compared to the WebGL backend, the WASM backend has better numerical stability, and wider device support. Performance-wise, our benchmarks show that:

    • For medium-sized models (~100-500M multiply-adds), the WASM backend is several times slower than the WebGL backend.
    • For lite models (~20-60M multiply-adds), the WASM backend has comparable performance to the WebGL backend (see the Face Detector model above).

    We are committed to supporting the WASM backend and will continue to improve performance. We plan to follow the WebAssembly standard closely and benefit from its upcoming features such as multi-threading.

    How many ops have you implemented?

    See the repository for an up-to-date list of supported ops. We love contributions. See the contributing document for more info.

    Do you support training?

    Maybe. There are still a decent number of ops that we are missing in WASM that are needed for gradient computation. At this point we are focused on making inference as fast as possible.

    Do you work in node?

    Yes. If you run into issues, please let us know.

    Do you support SIMD and multi-threading?

    Yes. We take advantage of SIMD and multi-threading wherever they are supported by testing the capabilities of your runtime and loading the appropriate WASM binary. If you intend to serve the WASM binaries from a custom location (via ), please note that the SIMD-enabled and threading-enabled binaries are separate from the regular binary.

    How do I give feedback?

    We'd love your feedback as we develop this backend! Please file an issue here.

    Emscripten installation

    The Emscripten installation necessary to build the WASM backend is managed automatically by the Bazel Emscripten Toolchain.

    Building

    Testing

    Deployment

    ./scripts/build-npm.sh
    npm publish
    Source: https://github.com/tensorflow/tfjs/blob/master/tfjs-backend-wasm/README.md

    September 02, 2020 — Posted by Ann Yuan and Marat Dukhan, Software Engineers at Google

    Supercharging the TensorFlow.js WebAssembly backend with SIMD and multi-threading

    In March we introduced a new WebAssembly (Wasm) accelerated backend for TensorFlow.js (scroll further down to learn more about Wasm and why this is important). Today we are excited to announce a major performance update: as of TensorFlow.js version 2.3.0, our Wasm backend has become up to 10X faster by leveraging SIMD (vector) instructions and multithreading via XNNPACK, a highly optimized library of neural network operators.

    Benchmarks

    SIMD and multithreading bring major performance improvements to our Wasm backend. Below are benchmarks in Google Chrome that demonstrate the improvements on BlazeFace, a light model with 0.1 million parameters and about 20 million multiply-add operations:

    (times listed are milliseconds per inference)

    Larger models, such as MobileNet V2, a medium-sized model with 3.5 million parameters and roughly 300 million multiply-add operations, attain even greater speedups:

    *Note: Benchmarks for the TF.js multi-threaded Wasm backend are not available for Pixel 4 because multi-threading support in mobile browsers is still a work-in-progress. SIMD support in iOS is also still under development.

    **Note: Node support for the TF.js multi-threaded Wasm backend is coming soon.

    The performance gains from SIMD and multithreading are independent of each other. These benchmarks show that SIMD brings a 1.7-4.5X performance improvement to plain Wasm, and multithreading brings another 1.8-2.9X speedup on top of that.


    Usage

    SIMD is supported as of TensorFlow.js 2.1.0, and multithreading is supported as of TensorFlow.js 2.3.0.

    At runtime we test for SIMD and multithreading support and serve the appropriate Wasm binary. Today we serve a different binary for each of the following cases:
    • Default: The runtime does not support SIMD or multithreading
    • SIMD: The runtime supports SIMD but not multithreading
    • SIMD + multithreading: The runtime supports SIMD and multithreading
    Since most runtimes that support multi-threading also support SIMD, we decided to omit the multi-threading-only runtime to keep our bundle size down. This means that if your runtime supports multithreading but not SIMD, you will be served the default binary. There are two ways to use the Wasm backend:
    1. With NPM
      The library expects the Wasm binaries to be located relative to the main JS file. If you’re using a bundler such as parcel or webpack, you may need to manually indicate the location of the Wasm binaries with our setWasmPaths helper.
      See the “Using bundlers” section in our README for more information.
    2. With script tags
      NOTE: TensorFlow.js defines a priority for each backend and will automatically choose the best supported backend for a given environment. Today, WebGL has the highest priority, followed by Wasm, then the vanilla JS backend. To always use the Wasm backend, we need to explicitly call tf.setBackend('wasm'), as in the sketch below.
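    A minimal script-tag sketch; the CDN URLs are the same jsdelivr bundles shown in the package README above, and main() stands in for your own entry point:

    <!-- Load TF.js and the Wasm backend from the CDN -->
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
    <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs-backend-wasm/dist/tf-backend-wasm.js"></script>
    <script>
      // Force the Wasm backend; the SIMD / threaded binary is selected automatically at runtime.
      tf.setBackend('wasm').then(() => main());
    </script>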

    Demo

    To see the performance improvements for yourself, check out this demo of our BlazeFace model, which has been updated to use the new Wasm backend: https://tfjs-wasm-simd-demo.netlify.app/ To compare against the unoptimized binary, try this version of the demo, which manually turns off SIMD and multithreading support.

    What is Wasm?

    WebAssembly (Wasm) is a cross-browser binary format that brings near-native code execution speed to the web. Wasm serves as a compilation target for programs written in statically typed high level languages, such as C, C++, Go, and Rust. In TensorFlow.js, we implement our Wasm backend in C++ and compile with Emscripten. The XNNPACK library provides a heavily optimized implementation of neural network operators underneath.

    Wasm has been supported by Chrome, Safari, Firefox, and Edge since 2017, and is supported by 90% of devices worldwide.

    The WebAssembly specification is evolving quickly and browsers are working hard to support a growing number of experimental features. You can visit this site to see which features are supported by your runtime, including:
    1. SIMD
      SIMD stands for Single Instruction, Multiple Data, which means that SIMD instructions operate on small fixed-size vectors of elements rather than individual scalars. The Wasm SIMD proposal makes the SIMD instructions supported by modern processors usable inside Web browsers, unlocking significant performance gains.

      Wasm SIMD is a phase 3 proposal, and is available via an origin trial in Chrome 84-86. This means developers can opt in their websites to Wasm SIMD and all their visitors will enjoy its benefits without needing to explicitly enable the feature in their browser settings. Besides Google Chrome, Firefox Nightly supports Wasm SIMD by default.

    2. Multi-threading
      Nearly all modern processors have multiple cores, each of which is able to carry out instructions independently and concurrently. WebAssembly programs can spread their work across cores via the threads proposal for performance. This proposal allows multiple Wasm instances in separate web workers to share a single WebAssembly.Memory object for fast communication between workers.

      Wasm threads is a phase 2 proposal, and has been available in Chrome desktop by default since version 74. There is an ongoing cross-browser effort to enable this functionality for mobile devices as well.
    To see which browsers support SIMD, threads, and other experimental features, check out the WebAssembly roadmap.

    Other improvements

    Since the original launch of our Wasm backend in March, we have extended operator coverage and now support over 70 operators. Many of the new operators are accelerated through the XNNPACK library, and unlock support for additional models, like the HandPose model.

    Looking ahead

    We expect the performance of our Wasm backend to keep improving. We’re closely following the progress of several evolving specifications in WebAssembly, including flexible vectors for wider SIMD, quasi fused multiply-add, and pseudo-minimum and maximum instructions. We’re also looking forward to ES6 module support for WebAssembly modules. As with SIMD and multithreading, we intend to take advantage of these features as they become available with no implications for TF.js user code.

    More information

    Acknowledgements

    We would like to thank Daniel Smilkov and Nikhil Thorat for laying the groundwork of the WebAssembly backend and the integration with XNNPACK, Matsvei Zhdanovich for collecting Pixel 4 benchmark numbers, and Frank Barchard for implementing low-level Wasm SIMD optimizations in XNNPACK.
    Source: https://blog.tensorflow.org/2020/09/supercharging-tensorflowjs-webassembly.html



        A WASI-like extension for Tensorflow

        AI inference is a computationally intensive task that could benefit greatly from the speed of Rust and WebAssembly. However, the standard WebAssembly sandbox provides very limited access to the native OS and hardware, such as multi-core CPUs, GPU and specialized AI inference chips. It is not ideal for the AI workload.

        The popular WebAssembly System Interface (WASI) provides a design pattern for sandboxed WebAssembly programs to securely access native host functions. The WasmEdge Runtime extends the WASI model to support access to native Tensorflow libraries from WebAssembly programs. It provides the security, portability, and ease-of-use of WebAssembly and native speed for Tensorflow.

        Table of contents

        Prerequisite

        You need to install WasmEdge and Rust.

        Build

        Run

        The utility is the WasmEdge build that includes the Tensorflow and Tensorflow Lite extensions.

        Make it run faster

        To make Tensorflow inference run much faster, you could AOT compile it down to machine native code, and then use WasmEdge sandbox to run the native code.

        Code walkthrough

        It is fairly straightforward to use the WasmEdge Tensorflow API. You can see the entire source code in main.rs.

        First, it reads the trained TFLite model file (ImageNet) and its label file. The label file maps numeric output from the model to English names for the classified objects.

        Next, it reads the input image and converts it to the size and RGB pixel arrangement required by the Tensorflow Lite model.

        Then, the program runs the TFLite model with its required input tensor (i.e., the flat image in this case), and receives the model output. In this case, the model output is an array of numbers. Each number corresponds to the probability of an object name in the label text file.

        Let's find the object with the highest probability, and then look up the name in the labels file.

        Finally, it prints out the result.

        Prerequisite

        You need to install WasmEdge. You also need the QuickJS interpreter for WasmEdge, which is included in the WasmEdge repo.

        You can build your own from the wasmedge-quickjs project.

        Run

        The utility is the WasmEdge build that includes the Tensorflow and Tensorflow Lite extensions.

        Code walkthrough

        It is fairly straightforward to use the WasmEdge JavaScript Tensorflow API. You can see the entire source code in main.js.

        First, it reads the image from a file and converts it to the size and RGB pixel arrangement required by the Tensorflow Lite model.

        Then, the program runs the TFLite model with its required input tensor (i.e., the pixel image in this case), and receives the model output. In this case, the model output is an array of numbers. Each number corresponds to the probability of an object name in the label text file.

        Let's find the object with the highest probability, and then look up the name in the labels file.

        Finally, it prints the result to the console.

        All the tutorials below use the WasmEdge Rust SDK for Tensorflow to create AI inference functions. Those Rust functions are then compiled to WebAssembly and deployed together with WasmEdge on the cloud. If you are not familiar with Rust, you can try our experimental AI inference DSL.

        Serverless functions

        The following tutorials showcase how to deploy WebAssembly programs (written in Rust) on public cloud serverless platforms. The WasmEdge Runtime runs inside a Docker container on those platforms. Each serverless platform provides APIs to get data into and out of the WasmEdge runtime through STDIN and STDOUT.

        Second Sate FaaS and Node.js

        The following tutorials showcase how to deploy WebAssembly functions (written in Rust) on the Second State FaaS. Since the FaaS service is running on Node.js, you can follow the same tutorials for running those functions in your own Node.js server.

        Service mesh

        The following tutorials showcase how to deploy WebAssembly functions and programs (written in Rust) as sidecar microservices.

        • The Dapr template shows how to build and deploy Dapr sidecars in Go and Rust languages. The sidecars then use the WasmEdge SDK to start WebAssembly programs to process workloads to the microservices.

        Data streaming framework

        The following tutorials showcase how to deploy WebAssembly functions (written in Rust) as embedded handler functions in data streaming frameworks for AIoT.

        • The YoMo template starts the WasmEdge Runtime to process image data as the data streams in from a camera in a smart factory.
        Source: https://www.secondstate.io/articles/wasi-tensorflow/

