NumPy Functions Composed (2024)

Compare Fast Inverse Square Root Method to NumPy ufuncs, Numba JIT, and Cython — Which One Wins?


This article was originally published on medium.com.

Posted on behalf of: Bob Chesebrough, Solutions Architect, Intel Corporation

This article compares several ways of computing the reciprocal square root and shows that the efficient variations can run 3 to 4 orders of magnitude faster than older methods.

The approach you choose depends on your need for accuracy, speed, reliability, and maintainability. See which one scores highest in your needs assessment!

What is reciprocal sqrt?

It is a function composed of two operations:

  1. Reciprocal (Multiplicative Inverse)
  2. Square root of a value or vector

Where is it used?

Physics / Engineering

  • Partial derivative of the distance formula with respect to one dimension. For d = sqrt(x^2 + y^2 + z^2), ∂d/∂x = x / sqrt(x^2 + y^2 + z^2).

Special Relativity Lorentz transformation

  • The gamma coefficient, γ = 1 / sqrt(1 - v^2/c^2), is used to compute time dilation and length contraction, among other calculations in special relativity.

3D Graphics or ML applications needing normalization in space

  • Vector normalization is used throughout 3D programming, for example in video games. This is the use case that made the Fast Reciprocal Sqrt famous through Quake III.
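To make the normalization use case concrete, here is a minimal sketch (not taken from the article's notebook). The function name `normalize_rows` and the sample points are hypothetical, and the sketch assumes no row is the zero vector.

```python
import numpy as np

def normalize_rows(vectors):
    """Scale each row of `vectors` to unit length (hypothetical helper)."""
    # Squared length of every row: x*x + y*y + z*z
    squared_norms = np.einsum("ij,ij->i", vectors, vectors)
    # Reciprocal square root of the squared length
    inv_norms = np.reciprocal(np.sqrt(squared_norms))
    # Multiply each component by 1/length instead of dividing by the length
    return vectors * inv_norms[:, np.newaxis]

points = np.array([[3.0, 4.0, 0.0],
                   [1.0, 2.0, 2.0]])
unit = normalize_rows(points)
print(unit)
print(np.linalg.norm(unit, axis=1))  # every row should now have length ~1
```

Computing the reciprocal square root once per vector and multiplying it into each component is the pattern that makes a fast 1/sqrt() valuable in graphics code.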

This is an interesting topic because it intersects the recent history of algorithms, hardware, and software library implementations. Methods for computing this quickly, even as an approximation, at first outpaced what the x86 instruction set offered. Two Berkeley researchers, William Kahan and K.C. Ng, wrote an unpublished paper in May 1986 describing how to calculate the square root using bit-fiddling techniques followed by Newton iterations. This was picked up by Cleve Moler and company, of MATLAB* fame, and Cleve’s associate Greg Walsh devised the now-famous constant and the fast inverse square root algorithm. Read the fascinating history on Wikipedia.

The algorithm is still easy to find on the web, along with many articles and advice on using it today. It is an ingenious application of the Newton-Raphson method to get fast approximations of 1/sqrt().
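The bit trick and the Newton-Raphson refinement can be sketched in NumPy as follows. This is an illustrative sketch rather than the code from the article's notebook: the function name `fast_rsqrt` and its `iterations` parameter are assumptions, the input must be positive float32 values, and `0x5F3759DF` is the widely published constant from the Quake III source.

```python
import numpy as np

def fast_rsqrt(x, iterations=1):
    """Approximate 1/sqrt(x) for an array of positive float32 values."""
    x = np.asarray(x, dtype=np.float32)
    half_x = np.float32(0.5) * x
    # Reinterpret the float32 bit pattern as a 32-bit integer (no conversion)
    bits = x.view(np.int32)
    # Magic constant minus half the bit pattern gives a rough first guess
    bits = np.int32(0x5F3759DF) - (bits >> 1)
    y = bits.view(np.float32)
    # Each Newton-Raphson step refines the guess: y <- y * (1.5 - 0.5 * x * y^2)
    for _ in range(iterations):
        y = y * (np.float32(1.5) - half_x * y * y)
    return y

values = np.array([0.25, 1.0, 2.0, 100.0], dtype=np.float32)
exact = 1.0 / np.sqrt(values.astype(np.float64))
for n in (0, 1, 2):
    approx = fast_rsqrt(values, iterations=n)
    rel_err = np.max(np.abs(approx - exact) / exact)
    print(f"{n} Newton step(s): max relative error = {rel_err:.2e}")
```

Printing the error for zero, one, and two Newton steps shows how quickly each iteration tightens the initial guess.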

Then, in 1999, Intel introduced the Pentium III, whose new Streaming SIMD Extensions (SSE) instruction set included an instruction that computes the reciprocal sqrt: a vectorized instruction!

What is the best algorithm now?

That all depends on perspective. The Fast Reciprocal Sqrt is a clever trick, and from what I see its accuracy is within about 1% of the actual value. SSE and Advanced Vector Extensions (AVX) are vectorized instructions based on IEEE floating-point standards, but their results are also approximations. The accuracy depends on the data type you choose but is typically far better than 1%. When dealing with floating-point calculations such as these, the order of operations and how you accumulate partial sums matter. So let me provide the code and a little discussion of what I found, and you be the judge!
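One way to see how the data type affects accuracy is to compare a float32 result against a float64 reference. The sketch below is an assumption-laden illustration, not the article's benchmark: it uses NumPy's ordinary `np.sqrt` and `np.reciprocal` (not the approximate hardware reciprocal-sqrt instruction), and the input range and array size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
x64 = rng.uniform(0.1, 1000.0, size=10_000)  # float64 inputs used as the reference
x32 = x64.astype(np.float32)                 # same values rounded to float32

reference = 1.0 / np.sqrt(x64)
result32 = np.reciprocal(np.sqrt(x32))       # computed entirely in float32

rel_err = np.abs(result32.astype(np.float64) - reference) / reference
print(f"float32 max relative error: {rel_err.max():.2e}")
print(f"float32 machine epsilon:    {np.finfo(np.float32).eps:.2e}")
```

The relative error stays near the float32 machine epsilon, which is why the choice of data type, rather than the vectorization itself, usually sets the accuracy floor.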

What do I test in this notebook? (see GitHub link at the end)

In this notebook, we will test the following approaches:

PyTorch rsqrt:

  • Use the torch built-in function rsqrt()

NumPy_Compose RecipSqrt

  • Use NumPy np.reciprocal(np.sqrt())

NumPy_Simple

  • Use NumPy implicit vectors: b = a**-.5

Cython_Simple

  • Use Cython variant of Simple a**-.5

Numba_Simple

  • Use Numba njit variant of Simple a**-.5

BruteForceLoopExact

  • Brute-force loop approach with no vectorization at all

Fast Reciprocal Sqrt Newton-Raphson Simple Loop

  • Fast Reciprocal Sqrt using Newton-Raphson and the Quake III approach

Fast Reciprocal Sqrt Newton-Raphson Vectorized

  • Fast Reciprocal Sqrt using Newton-Raphson and the Quake III approach, vectorized with np.apply

Fast Reciprocal Sqrt Newton-Raphson Cython

  • Fast Reciprocal Sqrt using Newton-Raphson and the Quake III approach in Cython
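Three of the approaches listed above (NumPy_Compose, NumPy_Simple, and BruteForceLoopExact) can be sketched with only NumPy and the standard library. This is a rough illustration, not the notebook's code: the function names, array size, and repeat count are assumptions, and timings will vary by platform.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
a = rng.uniform(1.0, 100.0, size=100_000)

def numpy_compose(x):
    # Compose two ufuncs: reciprocal of the square root
    return np.reciprocal(np.sqrt(x))

def numpy_simple(x):
    # Implicit vectorization through the power operator
    return x ** -0.5

def brute_force_loop(x):
    # Element-by-element Python loop, no vectorization
    out = np.empty_like(x)
    for i in range(x.size):
        out[i] = 1.0 / (x[i] ** 0.5)
    return out

for fn in (numpy_compose, numpy_simple, brute_force_loop):
    seconds = timeit.timeit(lambda: fn(a), number=3) / 3
    print(f"{fn.__name__:18s} {seconds * 1e3:9.3f} ms per call")
```

All three functions return the same values to within floating-point rounding; only the time per call differs.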

Where did the results finish?

It depends on the platform.

On the new Intel® Tiber™ Developer Cloud (Ubuntu 22; Intel® Xeon® Platinum 8480L, 224 logical CPUs, 503 GB RAM), I see the following:

[Figure: Testing various algorithms and optimizations for inverse square root]

My results tend to align with the observations in Doug Woo’s article, and I see orders-of-magnitude speedups using built-in functions in NumPy and PyTorch.

The code for this article and the rest of the series is available on GitHub.

Next Steps

Try out this code sample using the standard free Intel Developer Cloud account and the ready-made Jupyter Notebook.

We encourage you to also check out and incorporate Intel’s other AI/ML Framework optimizations and end-to-end portfolio of tools into your AI workflow. You can also learn about the unified, open, standards-based oneAPI programming model that forms the foundation of Intel’s AI Software Portfolio and helps you prepare, build, deploy, and scale your AI solutions.

Intel Developer Cloud System Configuration as tested:

x86_64, CPU op-mode(s): 32-bit, 64-bit, Address sizes: 52 bits physical, 57 bits virtual, Byte Order: Little Endian, CPU(s): 224, On-line CPU(s) list: 0–223, Vendor ID: GenuineIntel, Model name: Intel® Xeon® Platinum 8480+, CPU family: 6, Model: 143, Thread(s) per core: 2, Core(s) per socket: 56, Socket(s): 2, Stepping: 8, CPU max MHz: 3800.0000, CPU min MHz: 800.0000
