%%time
import time
for _ in range(1000):
    time.sleep(0.01)  # sleep for 0.01 seconds
CPU times: user 24.1 ms, sys: 17.9 ms, total: 42 ms
Wall time: 11 s
Magic commands in Jupyter notebooks are handy shortcuts that can perform various helpful tasks to streamline your data analysis. Magic commands are prefixed by a %
character. They are not part of Python syntax, but are provided by IPython, the kernel that Jupyter notebooks use to execute Python code.
Magic commands are divided into two categories: line magics and cell magics.
Line magics are denoted by a single %
prefix and operate on a single line of input.
For example, the %run magic command lets you run a Python script as a program, with any command-line arguments passed through to the script. It is well documented that %run can execute Python code from .py files. Less well known is that it can also execute other Jupyter notebooks, which can be quite useful.
Here is an example:
Assuming we have a Python script file named ‘script.py’:
%run script.py
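For readers who want similar behaviour in a plain Python process, the standard library's runpy.run_path is a rough analogue of %run. This is only a sketch under assumptions, not how IPython implements the magic: the script contents and the variable name answer are made up for illustration, and the script is written to a temporary directory so nothing on disk is touched.

```python
import runpy
import tempfile
from pathlib import Path

# Hypothetical script contents, used only for this illustration.
SCRIPT_SOURCE = (
    "answer = 6 * 7\n"
    "if __name__ == '__main__':\n"
    "    print('running as a program')\n"
)

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "script.py"
    script.write_text(SCRIPT_SOURCE)
    # run_name='__main__' makes the script's main guard fire, as it would
    # when the file is executed as a program.
    namespace = runpy.run_path(str(script), run_name="__main__")

# run_path returns the script's globals as a dictionary.
print(namespace["answer"])
```

One difference to keep in mind: %run places the script's variables in the notebook's namespace by default, whereas runpy.run_path hands them back as a dictionary.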
The %load
command will load the contents of a file into a code cell. This can be useful for loading scripts or modules that you’re working on. Here is an example:
Assuming we have a Python script file named ‘script.py’:
%load script.py
Cell magics are denoted by a double %%
prefix and operate on multiple lines of input (the entire cell).
For example, the %%writefile
magic command writes the contents of a cell to a file. This can be useful when you want to write Python code in a Jupyter cell and also save it to a .py
file.
Here is an example:
%%writefile test.py
print('Hello, World!')
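Outside a notebook, the effect of %%writefile can be approximated with pathlib. The sketch below is an assumption-laden illustration rather than IPython's implementation: it writes to a temporary directory instead of the working directory so that no existing test.py is overwritten, and the names cell_body, target and loaded are illustrative.

```python
import tempfile
from pathlib import Path

# The text that would sit below the %%writefile line in the cell.
cell_body = "print('Hello, World!')\n"

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "test.py"
    target.write_text(cell_body)   # save the cell body to a file
    loaded = target.read_text()    # read it back, roughly what %load pulls into a cell

print(loaded, end="")
```

Like write_text, %%writefile replaces an existing file by default; its -a flag appends instead.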
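The %%time command is helpful for performance testing: it reports the CPU time and the wall-clock time that a cell takes to execute.
For the cell shown at the top of this article, which calls time.sleep(0.01) 1,000 times, it reports 42 ms of total CPU time but 11 s of wall time, because a sleeping process uses almost no CPU.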
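The gap between CPU time and wall time that %%time reports can be reproduced outside IPython. The following is a minimal sketch, not the magic's actual implementation: it assumes time.process_time() as a stand-in for the combined user and sys figures and time.perf_counter() for wall time. The helper name measure and the shortened loop of 20 sleeps are illustrative choices.

```python
import time


def measure(func):
    """Run func once and return (cpu_seconds, wall_seconds)."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func()
    cpu_seconds = time.process_time() - cpu_start
    wall_seconds = time.perf_counter() - wall_start
    return cpu_seconds, wall_seconds


def sleepy_loop():
    # Shortened version of the article's loop: 20 sleeps instead of 1000.
    for _ in range(20):
        time.sleep(0.01)


cpu, wall = measure(sleepy_loop)
print(f"CPU time:  {cpu * 1000:.1f} ms")
print(f"Wall time: {wall * 1000:.1f} ms")
```

Because the loop spends nearly all of its time asleep, the CPU figure should stay far below the wall figure, which is the same pattern as in the %%time output.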
Magic commands can be very useful in a data engineering context. Here are some examples of magic commands that are commonly used in data engineering.
%sql / %%sql
These magics can be used to run SQL queries directly from a Jupyter notebook, which is handy when working with databases. You need to install the ipython-sql extension first.
%load_ext sql
%sql postgresql://username:password@localhost:5432/dbname
%sql SELECT * FROM table_name
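To see what such a query returns without a running PostgreSQL server, the same SELECT can be tried against an in-memory SQLite database. This is a sketch under stated assumptions and does not go through ipython-sql: it uses the standard library's sqlite3 module, and the columns id and label and the two sample rows are invented for the illustration.

```python
import sqlite3

# In-memory database, discarded when the connection closes.
conn = sqlite3.connect(":memory:")

# Hypothetical schema and rows, standing in for a real table.
conn.execute("CREATE TABLE table_name (id INTEGER PRIMARY KEY, label TEXT)")
conn.executemany(
    "INSERT INTO table_name (label) VALUES (?)",
    [("alpha",), ("beta",)],
)

# The same query that the %sql line sends to the database.
rows = conn.execute("SELECT * FROM table_name").fetchall()
conn.close()

for row in rows:
    print(row)
```

In a notebook, the %sql line does the connecting and fetching for you and displays the result as a table.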
%matplotlib inline
This command is used to display plots inline, inside the notebook. This is useful for data visualization which is a key part of data analysis and data engineering.
%load_ext autoreload / %autoreload 2
These commands are used to automatically reload modules before executing user code. This is useful when you are editing your Python modules and want the changes to be loaded automatically in your notebook.
%load_ext autoreload
%autoreload 2
import my_module
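The manual counterpart of autoreload is importlib.reload, and seeing it in action makes clear what the extension saves you from doing by hand. The sketch below is an illustration under assumptions, not the extension's mechanism: the module name my_demo_module and its GREETING variable are hypothetical, the module lives in a temporary directory, and bytecode caching is switched off so that a stale cached file cannot hide the edit.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Avoid writing .pyc files, so the edited source is always re-read.
sys.dont_write_bytecode = True

with tempfile.TemporaryDirectory() as tmp:
    module_path = Path(tmp) / "my_demo_module.py"
    module_path.write_text("GREETING = 'hello'\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        demo = importlib.import_module("my_demo_module")
        before = demo.GREETING

        # Simulate editing the module in an editor.
        module_path.write_text("GREETING = 'hello again'\n")
        importlib.invalidate_caches()

        # Without this call the notebook would keep using the old value.
        importlib.reload(demo)
        after = demo.GREETING
    finally:
        # Clean up so the temporary module does not linger.
        sys.path.remove(tmp)
        sys.modules.pop("my_demo_module", None)

print(before, "->", after)
```

With %autoreload 2 active, the reload step happens automatically before each cell runs, so edited modules are picked up without an explicit call.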
%history
The %history
magic command in Jupyter notebooks offers a way to see past commands that you have executed. This is very useful for many reasons such as auditing what commands were run, repeating past commands, and understanding the sequence of actions you’ve taken.
The syntax is %history [-n] [-o] [-p] [-t] [-f FILENAME] [-g [PATTERN [PATTERN ...]]] [-l [LIMIT]] [-u] [range [range ...]].
Typing %history on its own will display the command history. It supports a variety of parameters that provide finer control over what gets displayed:
-n
: Show line numbers for each command.
-o
: Include output for commands.
-p
: Print classic Python '>>>' prompts before each input. This is useful for writing documentation and, combined with -o, for producing doctest-ready output.
-t
: Print the translated history, that is, the input after IPython has converted it into valid Python source (magics and aliases become function calls), instead of the text as you typed it.
-f FILENAME
: Write the output to FILENAME instead of printing it to the screen.
-g
: Treat the argument as a glob pattern to search for in the full history, where ? matches one character and * matches any number of characters. This includes the saved history from earlier sessions, which IPython stores in its own history database rather than in ~/.python_history.
-l [LIMIT]
: Print only the last LIMIT lines, taken across all sessions. Without a value, the last 10 lines are shown.
-u
: When searching with -g, show only unique entries.
range [range ...]
: One or more ranges of input history, e.g. “5 8-10 23”.
Here is an example of how to use the %history command:
1:
%%time
import time
for _ in range(1000):
    time.sleep(0.01)  # sleep for 0.01 seconds
2:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 10, 100)
plt.plot(x, np.sin(x))
3:
%history -n 1-5
This command is useful for keeping track of what you have executed, particularly in long analysis processes. It can also help in creating a script from a notebook session.
In data engineering, %history
can be beneficial for reviewing the sequence of data transformations and analyses you’ve performed.