Executing your code

Layer lets you run the same code remotely on the Layer clusters that you run locally. Running remotely lets you take advantage of compute resources greater than what you have available locally, without having to worry about infrastructure or orchestration.

Initialize a project

To get started with Layer, the first step is to set up a project. Enter the following commands in your notebook:

import layer

# first, make sure we're logged in to Layer
layer.login()

# tell Layer which project you are working under, and ensure the project exists remotely
layer.init("my_example_project")

The layer.init call above makes sure that the Layer backend has a project with the given name.

Sharing your dataset or model entities with others

Once you initialize your project, you can use decorators around your functions to identify which functions will generate your datasets and your models. For example, let's say you have a simple function that generates a dataset, as shown next:

import pandas as pd

def create_product_dataset():
    data = [[1, "product1", 15], [2, "product2", 20], [3, "product3", 10]]
    dataframe = pd.DataFrame(data, columns=["Id", "Product", "Price"])
    return dataframe

df = create_product_dataset()

You will normally start by running the function above locally on your computer. At some point, you might want to share the generated dataset with others. To do so, annotate the function with the @dataset decorator and give the dataset a name. In the example below we name the dataset "my_products":

import pandas as pd
from layer.decorators import dataset

@dataset("my_products")
def create_product_dataset():
    data = [[1, "product1", 15], [2, "product2", 20], [3, "product3", 10]]
    dataframe = pd.DataFrame(data, columns=["Id", "Product", "Price"])
    return dataframe

df = create_product_dataset()

If you execute the annotated function locally, you will still be executing the code in your local environment. In addition to running the code, however, the Layer SDK will see that the function is decorated with @dataset. As a result, it will contact the Layer backend to store and version the dataset remotely and associate it with the my_example_project project you initialized earlier.
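
The decorated function still returns a plain pandas DataFrame when called locally, so you can keep working with it in your notebook exactly as before. The short snippet below is only an illustration of that:

# calling the decorated function locally still returns an ordinary DataFrame
df = create_product_dataset()
print(df.head())
print(df["Price"].mean())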

The "my_products" dataset will then be retrievable and accessible by others in your organisation via the layer.get_dataset function:

import layer

df = layer.get_dataset("my_account/my_example_project/datasets/my_products").to_pandas()
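
Model entities work the same way. The following is a minimal sketch: it assumes the @model decorator from layer.decorators, a layer.get_model counterpart to layer.get_dataset, and purely illustrative scikit-learn training code, so treat the names and path format here as assumptions rather than a definitive recipe.

import layer
from layer.decorators import model
from sklearn.linear_model import LinearRegression  # illustrative model choice

# "price_predictor" is a hypothetical model name used only for this sketch
@model("price_predictor")
def train_price_model():
    # reuse the dataset created earlier; path format as in the example above
    df = layer.get_dataset("my_account/my_example_project/datasets/my_products").to_pandas()
    regressor = LinearRegression()
    regressor.fit(df[["Id"]], df["Price"])
    return regressor

trained = train_price_model()

Others in your organisation could then fetch the trained model, for example via layer.get_model("my_account/my_example_project/models/price_predictor"), again assuming the path format mirrors the dataset example.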

Executing code in the Layer backend

In addition to easily sharing model or dataset entities, annotating a function with a Layer decorator allows you to use the Layer backend to run your code. This makes sense once you start processing large amounts of data or when your training code requires compute resources greater than what you have locally.

You can ask Layer to run your functions in its backend by using the layer.run() function and passing in a list of the decorated functions you want it to execute. For example, if we want to generate our dataset in the Layer backend, we would do the following end to end:

import layer
from layer.decorators import dataset
import pandas as pd

@dataset("my_products")
def create_product_dataset():
    data = [[1, "product1", 15], [2, "product2", 20], [3, "product3", 10]]
    dataframe = pd.DataFrame(data, columns=["Id", "Product", "Price"])
    return dataframe

# Initialize a project so the entities we create become associated with this project
layer.init("my_example_project")

# tell Layer to execute our annotated function or functions remotely
layer.run([create_product_dataset])

The above code will both trigger the remote execution of the functions and store the generated entities in Layer so that you or others can retrieve them later.
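
A single run can include several decorated functions, since layer.run takes a list. The sketch below assumes the hypothetical train_price_model function from the earlier model example alongside create_product_dataset; once the run finishes, the generated entities can be fetched as shown before:

# pass every decorated function you want executed in the Layer backend
layer.run([create_product_dataset, train_price_model])

# after the run, fetch the generated entities like any other project entity
products = layer.get_dataset("my_account/my_example_project/datasets/my_products").to_pandas()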