Performance Comparison of Different Exponentiation Methods in Python, Summaries of Advanced Computer Programming

Greater Brighton Metropolitan College Advanced Computer Programming

This summary compares the performance of different ways of expressing exponentiation in Python: chained multiplication, the ** operator, and the math.pow() function. The author uses Python's timeit and dis modules to analyze the timing and bytecode of each method. It also discusses the implementation of Python's BINARY_MULTIPLY and BINARY_POWER operations and their performance for small and large numbers.

What you will learn

How do Python's BINARY_MULTIPLY and BINARY_POWER operations differ in their implementation and performance?
How does the performance of exponentiation in Python change as the power value increases?
What is the role of Python's timeit and disassemblers in analyzing the performance of different Python expressions?

At what point does chained multiplication become less efficient than exponentiation in Python?

Typology: Summaries

2021/2022

Uploaded on 09/27/2022

queenmary 🇬🇧


Timing Tests

Expression Disassembly

Multiplication

math.pow()

Exponentiation

BINARY_MULTIPLY versus BINARY_POWER

BINARY_MULTIPLY

BINARY_POWER

Charting Performance Differences

Generating Functions

math.pow() and Exponentiation

Chained Multiplication

Finding the Crossover

Charting the Performance

More Performance Testing

Conclusions

Recently, I was writing an algorithm to solve a coding challenge that involved finding a point in a Cartesian plane that had the minimum distance from all of the other points. In Python, the distance function would be expressed as math.sqrt(dx ** 2 + dy ** 2). However, there are several different ways to express each term: dx ** 2, math.pow(dx, 2), and dx * dx. Interestingly, these all perform differently, and I wanted to understand how and why.

Timing Tests

Python provides a module called timeit to test performance, which makes testing these timings rather simple. With x set to 2, we can run timing tests on all three of our options above:

Expression Timing (100k iterations)

x * x 3.87 ms

x ** 2 80.97 ms

math.pow(x, 2) 83.60 ms
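Measurements like these can be reproduced with a short script along the following lines. This is a minimal sketch rather than the author's original test code: the helper name time_expression is hypothetical, and the absolute numbers will vary with the machine and the Python version.

```python
import timeit


def time_expression(expression: str, iterations: int = 100_000) -> float:
    """Return the total time in milliseconds to run `expression` `iterations` times."""
    # The setup string runs once and defines the names the expression needs.
    seconds = timeit.timeit(expression, setup="import math; x = 2", number=iterations)
    return seconds * 1000


if __name__ == "__main__":
    for expression in ("x * x", "x ** 2", "math.pow(x, 2)"):
        print(f"{expression:<16}{time_expression(expression):8.2f} ms")
```

Passing the expression as a string lets timeit compile it once and run it in a tight loop, so the measurement is not dominated by the cost of calling a wrapper function.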

Expression Disassembly

Python also provides a module called dis that disassembles code so we can see what each of these expressions is doing under the hood, which helps in understanding the performance differences.

Multiplication

Using dis.dis(lambda x: x * x), we can see that the following code gets executed:

    0 LOAD_FAST                0 (x)
    2 LOAD_FAST                0 (x)
    4 BINARY_MULTIPLY
    6 RETURN_VALUE

The program loads x twice, runs BINARY_MULTIPLY, and returns the value.
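The same listing can be inspected programmatically. The sketch below collects the opcode names with dis.get_instructions; the helper name opnames is hypothetical. Note that the exact opcodes depend on the interpreter version: on Python 3.11 and later the multiplication appears as a generic BINARY_OP instruction rather than BINARY_MULTIPLY.

```python
import dis


def opnames(func):
    """Return the opcode names of a function's bytecode, in order."""
    return [instruction.opname for instruction in dis.get_instructions(func)]


square = lambda x: x * x
names = opnames(square)
print(names)

# The multiply shows up under a version-dependent opcode name.
multiply_ops = [name for name in names if name in ("BINARY_MULTIPLY", "BINARY_OP")]
print(multiply_ops)
```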

math.pow()

Using dis.dis(lambda x: math.pow(x, 2)), we can see the following code gets executed:

    0 LOAD_GLOBAL              0 (math)
    2 LOAD_ATTR                1 (pow)
    4 LOAD_FAST                0 (x)
    6 LOAD_CONST               1 (2)
    8 CALL_FUNCTION            2
   10 RETURN_VALUE

The math module loads from the global space, and then the pow attribute loads. Next, both arguments are loaded and the pow function is called, which returns the value.
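The extra work compared with x * x is visible in the bytecode itself: a global lookup for math before the call can happen. The sketch below extracts the names loaded with LOAD_GLOBAL; the variable names are hypothetical, and the opcodes used for the attribute lookup and the call differ between Python versions, so only the global lookup is checked here.

```python
import dis
import math

power_call = lambda x: math.pow(x, 2)

# Names fetched from the global namespace each time the lambda runs.
global_lookups = [
    instruction.argval
    for instruction in dis.get_instructions(power_call)
    if instruction.opname == "LOAD_GLOBAL"
]
print(global_lookups)

# math.pow converts its arguments to float, so the result is a float.
result = power_call(2)
print(type(result).__name__, result)
```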

        if (((Py_SIZE(a) ^ Py_SIZE(b)) < 0) && z) {
            _PyLong_Negate(&z);
            if (z == NULL)
                return NULL;
        }
        return (PyObject *)z;
    }

For small numbers, this uses binary multiplication. For larger values, the function uses Karatsuba multiplication, which is a fast multiplication algorithm for larger numbers.
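To make the idea behind Karatsuba multiplication concrete, here is a minimal Python sketch for non-negative integers. It is an illustration of the algorithm only, not a transcription of CPython's C implementation; the function name karatsuba and the cutoff of 16 are assumptions chosen for readability.

```python
def karatsuba(x: int, y: int) -> int:
    """Multiply two non-negative integers using Karatsuba's three-product split."""
    # Small operands: fall back to ordinary multiplication.
    if x < 16 or y < 16:
        return x * y

    # Split both numbers at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_high, x_low = x >> half, x & mask
    y_high, y_low = y >> half, y & mask

    # Three recursive products instead of the four a naive split would need.
    low = karatsuba(x_low, y_low)
    high = karatsuba(x_high, y_high)
    cross = karatsuba(x_low + x_high, y_low + y_high) - low - high

    return (high << (2 * half)) + (cross << half) + low


a = 12345678901234567890
b = 98765432109876543210
print(karatsuba(a, b) == a * b)
```

The saving comes from computing the cross term with one multiplication instead of two, which only pays off once the operands are large enough to outweigh the extra additions and shifts.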

We can see how this function gets called in ceval.c:

    case TARGET(BINARY_MULTIPLY): {
        PyObject *right = POP();
        PyObject *left = TOP();
        PyObject *res = PyNumber_Multiply(left, right);
        Py_DECREF(left);
        Py_DECREF(right);
        SET_TOP(res);
        if (res == NULL)
            goto error;
        DISPATCH();
    }

BINARY_POWER

This function is located here in the Python source code. It also does several interesting things:

The source code is too long to include in full, which partially explains the slower performance. Here are some interesting snippets:

    if (Py_SIZE(b) < 0) {  /* if exponent is negative */
        if (c) {
            PyErr_SetString(PyExc_ValueError, "pow() 2nd argument "
                            "cannot be negative when 3rd argument specified");
            goto Error;
        }
        else {
            /* else return a float.  This works because we know
               that this calls float_pow() which converts its
               arguments to double. */
            Py_DECREF(a);
            Py_DECREF(b);
            return PyFloat_Type.tp_as_number->nb_power(v, w, x);
        }
    }

After creating some pointers, the function checks if the power given is a float or is negative, where it either errors or calls a different function to handle exponentiation.
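The hand-off to the float implementation for negative exponents is observable from Python. The following short check is an illustration with hypothetical variable names; it only exercises the two-argument form of the operator.

```python
base = 2

# A non-negative integer exponent stays in integer arithmetic.
int_result = base ** 2
print(type(int_result).__name__, int_result)

# A negative integer exponent is delegated to float exponentiation.
float_result = base ** -2
print(type(float_result).__name__, float_result)
```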

If neither case applies, it checks for a third argument, which is always None according to ceval.c:

    case TARGET(BINARY_POWER): {
        PyObject *exp = POP();
        PyObject *base = TOP();
        PyObject *res = PyNumber_Power(base, exp, Py_None);
        Py_DECREF(base);
        Py_DECREF(exp);
        SET_TOP(res);
        if (res == NULL)
            goto error;
        DISPATCH();
    }

Finally, the function defines two routines: REDUCE for modular reduction and MULT for multiplication and reduction. The multiplication function uses long_mul for both values, which is the same function used in BINARY_MULTIPLY.

    #define REDUCE(X)
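To see how a multiply-then-reduce helper can drive exponentiation, here is a small Python sketch based on repeated squaring. It is an assumption-laden illustration, not CPython's code, which is considerably more elaborate; the names mult and power_sketch are hypothetical stand-ins for the roles MULT and REDUCE play.

```python
from typing import Optional


def mult(a: int, b: int, modulus: Optional[int]) -> int:
    """Multiply, then reduce modulo `modulus` when one is given."""
    product = a * b
    return product % modulus if modulus is not None else product


def power_sketch(base: int, exponent: int, modulus: Optional[int] = None) -> int:
    """Left-to-right binary exponentiation for non-negative integer exponents."""
    if exponent < 0:
        raise ValueError("this sketch only handles non-negative exponents")
    result = 1
    for bit in bin(exponent)[2:]:  # most significant bit first
        result = mult(result, result, modulus)    # square for every bit
        if bit == "1":
            result = mult(result, base, modulus)  # extra multiply for set bits
    return result


print(power_sketch(2, 15))
print(power_sketch(3, 200, 13) == pow(3, 200, 13))
```

Even in this simplified form, raising to a small power involves a loop, bit tests, and several multiplications, while x * x is a single multiply. That difference is consistent with the timings reported for small powers.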

We can use the timeit library above to profile code at different power values and see how the performance changes as the power increases.

Generating Functions

To test the performance at different power values, we need to generate some functions.

math.pow() and Exponentiation

Since both of these are already in the Python source, all we need to do is define a function for exponentiation we can call from inside a timeit call:

    exponent = lambda base, power: base ** power

Chained Multiplication

Since the multiplication expression changes each time the power changes, we need to generate a new multiplication function for each power. To do this, we can generate a string like x*x*x and call eval() on it to return a function:

    def generate_mult_func(n):
        mult_steps = '*'.join(['q'] * n)
        func_string = f'lambda q: {mult_steps}'  # Keep this so we can print later
        return eval(func_string), func_string

Thus, we can make a multiply function like so:

    multiply, func_string = generate_mult_func(power)

If we call generate_mult_func(4), multiply will be a lambda function that looks like this:

    lambda q: q*q*q*q
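A quick, self-contained check confirms that the generated function behaves like a fourth power. The generator is repeated here only so the snippet runs on its own.

```python
def generate_mult_func(n):
    """Build a lambda that multiplies its argument by itself n times in a chain."""
    mult_steps = '*'.join(['q'] * n)
    func_string = f'lambda q: {mult_steps}'
    # eval() is acceptable here because the string is built entirely from
    # constants; it should never be fed untrusted input.
    return eval(func_string), func_string


multiply, func_string = generate_mult_func(4)
print(func_string)   # lambda q: q*q*q*q
print(multiply(3))   # 81
```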

Finding the Crossover

Using the code posted here, we can determine at what point multiply becomes less efficient than exponent.

Starting with these values:

    base = 2
    power = 2

We loop until the time it takes to execute 100,000 iterations of multiply is slower than executing 100,000 iterations of exponent. Initially, here are the timings, with math.pow() serving as a point of comparison:

    Starting speeds:
    Multiply time   11.83 ms
    Exponent time   86.52 ms
    math.pow time   73.90 ms

When running on repl.it, Python finds the crossover in 1.2 s:

    Crossover found in 1.2 s:
    Base, power     2, 15
    Multiply time   110.09 ms
    Exponent time   108.20 ms
    math.pow time   79.82 ms
    Multiply func   lambda q: q*q*q*q*q*q*q*q*q*q*q*q*q*q*q

Thus, chaining multiplication together is faster until our expression gets to 2^14; at 2^15 exponentiation becomes faster.
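The search loop described above can be sketched as follows. This is not the author's linked code, which is not reproduced in this text: the function name find_crossover and the max_power cap are assumptions, the cap being added so the loop always terminates. The crossover found will depend on the machine and the Python version, and timing noise can shift it between runs.

```python
import timeit


def generate_mult_func(n):
    """Build a lambda computing q*q*...*q with n factors."""
    return eval("lambda q: " + "*".join(["q"] * n))


def find_crossover(base=2, iterations=100_000, max_power=64):
    """Return (power, multiply_time, exponent_time) at the first power where
    chained multiplication is slower than **, or None if none is found."""
    exponent = lambda b, p: b ** p
    for power in range(2, max_power + 1):
        multiply = generate_mult_func(power)
        multiply_time = timeit.timeit(lambda: multiply(base), number=iterations)
        exponent_time = timeit.timeit(lambda: exponent(base, power), number=iterations)
        if multiply_time > exponent_time:
            return power, multiply_time, exponent_time
    return None


if __name__ == "__main__":
    print(find_crossover())
```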

Charting the Performance

Using Pandas, we can keep track of the timing at each power:

Power multiply exponent math.pow
