PythonScript

Tag: Integration

The PythonScript block integrates a Python script into a workflow. The script is stored within the block and may be either edited directly or imported from a file on disk.

In the PythonScript configuration you can create block variables, which the script uses to read and write port data. Adding a variable creates ports with the same name: an input port, an output port, or a pair of both. Block variables are associated with script variables: their names become the names of global variables in the script’s scope.

The block starts only after it has received values on all input ports. These values are assigned to the variables associated with input ports, including those associated with an input-output port pair. A variable that is associated only with an output port is assigned its default value when the block starts (the default value of a variable is actually the value assigned to its port; see Ports). If no value is assigned to an output port, the associated variable remains uninitialized in the script.

After the script executes, the final values of all variables that have an associated output port are sent to these ports. If such a variable is still uninitialized when the script finishes, the block raises an error, which is then processed according to the block’s error handling behavior (see the Error handling behavior option).

Variables associated with ports are reinitialized on every block startup. Other (unassociated) variables are temporary unless you have the Keep globals option enabled (see option description for details).
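
The port semantics described above can be emulated outside pSeven with a plain Python sketch. The run_block helper below is hypothetical and is not part of pSeven; it only mimics the described behavior: input values become script globals, output-only variables receive a default if one is assigned, and an output variable that is never initialized causes an error. The names x, scale, and y are illustrative.

```python
def run_block(script, inputs, outputs, defaults=None):
    """Emulate one PythonScript run (hypothetical helper, not a pSeven API).

    script   -- source code of the block script
    inputs   -- dict: input variable name -> value received on the port
    outputs  -- names of variables that have an associated output port
    defaults -- dict: output-only variable name -> value assigned to its port
    """
    scope = {}
    # Output-only variables get their default value, if one is assigned.
    scope.update(defaults or {})
    # Input values (including input-output pairs) become script globals.
    scope.update(inputs)
    exec(script, scope)
    # Every variable with an output port must be initialized by now.
    missing = [name for name in outputs if name not in scope]
    if missing:
        raise NameError("uninitialized output variables: %s" % ", ".join(missing))
    return {name: scope[name] for name in outputs}


script = "y = x * scale"
result = run_block(script, inputs={"x": 3, "scale": 2}, outputs=["y"])
print(result)  # {'y': 6}
```

In the real block, the script body is only the part stored in the script string; the surrounding bookkeeping is done by the block itself.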

Options

Error handling behavior

Common option that controls the block’s behavior when it encounters an error during execution. See section Error Handling for details.


Keep globals

Controls the lifespan of global variables that are created (assigned) in the script but are not associated with block ports (not defined in the block configuration).

  • If disabled (default): global script variables are deleted and their values are lost when the script finishes.
  • If enabled: global variables are not deleted if they were assigned in the script; their values are kept in the block memory, making them available on the next startup.

Value: Boolean
Default: off

Enable this option if you want to add a step counter to your Python script:

try:
  step_index += 1
except NameError:
  step_index = 0

Here, on the first step the step_index variable does not exist, so using it raises a NameError exception. The exception is caught and step_index is initialized with 0. If Keep globals is enabled, the step_index variable is preserved between steps, so on subsequent steps NameError is not raised and step_index is incremented. Note that this code will not work if there is a block variable named step_index.
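
The effect of Keep globals on the counter script above can be illustrated with a standalone emulation. This is only a sketch: the run_steps helper is hypothetical, and it assumes that enabling the option amounts to reusing the script's global scope on the next startup.

```python
COUNTER_SCRIPT = """
try:
    step_index += 1
except NameError:
    step_index = 0
"""


def run_steps(script, steps, keep_globals):
    """Run the script several times (hypothetical emulation of block startups)."""
    kept = {}
    history = []
    for _ in range(steps):
        # With Keep globals on, the previous scope is reused;
        # otherwise every startup begins with an empty scope.
        scope = dict(kept) if keep_globals else {}
        exec(script, scope)
        history.append(scope["step_index"])
        if keep_globals:
            kept = scope
    return history


print(run_steps(COUNTER_SCRIPT, 3, keep_globals=True))   # [0, 1, 2]
print(run_steps(COUNTER_SCRIPT, 3, keep_globals=False))  # [0, 0, 0]
```

With the option disabled, the counter never advances because step_index is recreated on every startup.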

Notes

This section answers some specific questions on using the PythonScript block.

Environment Settings

PythonScript supports per-block environment settings. To set up the environment, open the block’s configuration dialog and switch to the Environment tab.

[Figure: the Environment tab of the PythonScript configuration dialog]

For details on using block environment settings, see section Environment.

Available Modules

PythonScript does not require you to install Python because it uses pSeven’s built-in Python interpreter (the same interpreter is used when you run a Python script in Workspace).

pSeven’s Python distribution includes a number of useful modules which you can import in a script:

  • Scientific computing: numpy, pandas, scipy, sympy, scikit-learn, and networkx.
  • Plotting: bokeh and matplotlib with cycler.
  • Excel support: openpyxl, xlrd and xlwt (note that pSeven also provides the Excel integration block).
  • PyWin32 modules — see PyWin32 Documentation.
  • Markup and parsing: et_xmlfile, fortranformat, markupsafe, pyparsing, whoosh, and yaml.
  • Networking: paramiko, requests, and tornado with tornado_json.
  • Cryptography: Crypto (PyCrypto) and ecdsa.
  • Various utilities: astor, decorator, and six; psutil and lockfile; dateutil, jdcal, and pytz.

Note that this section lists only names of modules that can be imported in user scripts. The list does not include names of all additional packages installed in pSeven’s Python; if interested, you can find them in section Open Source Components.
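
If a script depends on several of the modules listed above, it may be convenient to check that they can be imported before doing any real work. The sketch below is not a pSeven feature; is_available is a hypothetical helper, and the wanted list is only an example. It uses importlib.import_module from the Python standard library.

```python
import importlib


def is_available(module_name):
    """Return True if the module can be imported in this interpreter."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


# Example selection of modules; adjust it to what your script needs.
wanted = ["numpy", "openpyxl", "yaml"]
missing = [name for name in wanted if not is_available(name)]
print("missing modules:", missing)
```

The printed list depends on the interpreter that runs the script, so no particular output is expected here.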

Importing Third-Party Modules

New in version 6.6: PythonScript now automatically adds its working directory to sys.path, which allows importing modules placed in the block’s sandbox.

In addition to the modules available from pSeven by default, PythonScript also supports importing third-party modules. This works only for pure Python modules, that is, modules that do not include compiled code.

Two steps are required in order to make a module available for import:

  1. Set the block’s sandbox directory (see section Sandbox for details).
  2. Copy the module into the directory you specified.

PythonScript automatically adds its working directory (which is the sandbox) to sys.path, so after copying your module there, you can import it in the script as usual.
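
The two steps above can be tried outside pSeven with the following sketch. It makes several assumptions: a temporary directory stands in for the sandbox, the unit_tools module and its mm_to_m function are hypothetical, and the sys.path line is written out explicitly only because the sketch runs without the block (inside PythonScript that step is automatic).

```python
import os
import sys
import tempfile

# Stand-in for the block's sandbox directory (an assumption for this sketch).
sandbox = tempfile.mkdtemp()

# Step 2 from the list above: place a pure Python module in the sandbox.
# Here a tiny hypothetical module, unit_tools, is written in place.
with open(os.path.join(sandbox, "unit_tools.py"), "w") as f:
    f.write("def mm_to_m(value):\n    return value / 1000.0\n")

# PythonScript does this itself for its working directory.
if sandbox not in sys.path:
    sys.path.insert(0, sandbox)

import unit_tools

length_m = unit_tools.mm_to_m(2500)
print(length_m)  # 2.5
```

In a real block script, only the import statement and the code that uses the module would be needed.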

Opening Files in PythonScript

This question arises when, for some reason, you do not want to use an absolute file path. For files with absolute paths, PythonScript allows using Python’s built-in open(). The disadvantage of an absolute path is that the project is no longer portable: if you move it to another host or another directory, PythonScript may be unable to find the file because its path will be different.

Note

If you are going to create a custom parser or generator (that is, you want to open a text file and do something with its contents), consider using the Text block. This block automates file opening, and supports Python scripting, too.

If you want the project to be portable, then common solutions are:

  • The file is generated by another block, but is not output to a port.

    A typical case is when an external executable launched by a Program block does not accept the output file name as a command line parameter, and its output does not contain the needed information, so redirecting it to a file is of no use. In this situation you have to configure both blocks (the one generating the file and the PythonScript block) so that they share the same sandbox; this is done on the Sandbox tab of the block configuration dialog. The sandbox is the block’s working directory, so the generated file appears there when the Program block finishes. In the Python script, just use a relative path in open(): the script’s working directory is the PythonScript block’s sandbox, so a relative path is interpreted as a path inside the sandbox. This also works if you add a StringScalar input port to the PythonScript block and send the file name to it.

  • The file has to be created somewhere inside the project directory (or already exists there), and you want to use a path relative to the project.

    In this case, you should add a File block which creates a file object, and send this object to PythonScript. Configure the File block as described in FAQ: How do the files of project origin work with absolute and relative paths?, then add an input port of the File type to your PythonScript block and connect it with the file output port of the File block. See the next case for details on how to work with this file in your script.

  • The file comes from another block’s port.

    First, you need to add a new input port to your PythonScript block. Set its type to File, so the block knows that this port receives a file. Let us assume you have added this port and named it myfile (and linked it to the output port that sends the file, of course). As soon as you add a port to PythonScript, its name becomes an object name in the script scope, so you now have a myfile object in your script. This object is a pSeven file-like object: it is somewhat different from a Python file-like object, but the basic methods are the same. For example:

    myfile.open("r")
    try:
        data = myfile.read()
    finally:
        # close the file even if reading fails
        myfile.close()

    # the rest of code

    Note that the file has to be opened and closed correctly. Also, myfile.open() is a pSeven file method, and its opening mode parameter ("r") is required, unlike in Python’s built-in open().
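
For the first case in the list above, where the file appears in a sandbox shared with another block, reading it by a relative path can be sketched as follows. The sketch is a standalone emulation: a temporary directory plays the role of the sandbox, the file name solver_output.txt and its contents are hypothetical, and the file is written by the sketch itself instead of a Program block.

```python
import os
import tempfile

# Emulate the sandbox: inside the block, the working directory already is
# the sandbox, so this line is needed only to run the sketch standalone.
os.chdir(tempfile.mkdtemp())

# Emulate the Program block leaving its result file in the shared sandbox.
# In the block, "filename" could be a StringScalar input variable.
filename = "solver_output.txt"  # hypothetical file name
with open(filename, "w") as f:
    f.write("iterations 12\nresidual 1e-6\n")

# The part that belongs in the block script: a relative path is resolved
# against the working directory, that is, against the sandbox.
with open(filename, "r") as f:
    lines = [line.rstrip("\n") for line in f]

print(lines)  # ['iterations 12', 'residual 1e-6']
```

Because the path is relative, the same script keeps working after the project is moved to another directory or host.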

Old Versions

PythonScript was completely redesigned after the pSeven 2.0 release. The new version of the block is not compatible with the old one, but pSeven kept compatibility by allowing workflows that contain the old version of PythonScript to be loaded. However, the old block in such workflows was not replaced with the new one because it could not be upgraded automatically.

Support for the old version of PythonScript was finally discontinued in pSeven 6.6. Starting with this version, you can still open workflows containing the old PythonScript block, but they have to be updated manually.

Note that prior to pSeven 6.6 the old block was always kept intact if it existed in the workflow. Because of this, your workflow may still contain an old block version originating from pSeven 2.0 or below, even if you later edited the workflow in a newer pSeven version.

If you open such a workflow in pSeven 6.6 or above, the old version of the PythonScript block will keep the embedded script, but all block ports will be removed (consequently, links to this block will also be lost). The ports and links then have to be restored manually. The workflow will not run until you fix the block: on an attempt to start the workflow, pSeven will issue an error message informing you of the situation. pSeven will also add a comment to the block script starting with the following:

# This block was originally created in an old version which is
# no longer supported. It could not be completely upgraded to
# the current version. The original Python script was preserved, but
# block ports were removed and will have to be re-created manually.
# You will also need to restore links to these ports.
#
# Other contents of the workflow are not affected. The workflow will
# run normally after you fix this block.

The comment will then provide instructions on updating the block and list all ports and links that were removed, so you can add them back manually. After fixing the block, you should remove the comment and the line above it (that line raises an exception in order to prevent the broken workflow from running).