🐾 How to pass parameters in a Step Functions workflow 🐾
🤓 I often see that the way Step Functions passes parameters between states is hard to understand, and many people struggle with it. Let’s go step by step, and you’ll see that it’s much easier than you think.
Task state paths
A Task state has four fields you can use to control its input and output:
InputPath
Filters the state input, so that only the selected part is given to the step (task).
ResultSelector
Filters the task result, so you choose which values are passed on to ResultPath.
ResultPath
Lets you overwrite the state input with the task result, or add the result to the state input under a key you choose.
OutputPath
Filters the final output of the step, which becomes the input for the next step. If you don't specify an OutputPath, the entire JSON (determined by the state input, the task result, and ResultPath) is passed to the next state.
You can use these paths to modify information passed from one state to another.
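The order in which these four fields are applied can be sketched locally in Python. This is a simplified simulation, not the Step Functions API: the helper names (`get_path`, `set_path`, `select`, `run_task_state`) are hypothetical, only plain dotted paths such as `$.detail.commitId` are supported, the Parameters step is left out, and `fake_build` merely stands in for a real task.

```python
import copy

def get_path(doc, path):
    """Resolve a simple dotted path such as "$.detail.commitId" ("$" is the whole doc)."""
    value = doc
    for key in [part for part in path.lstrip("$").split(".") if part]:
        value = value[key]
    return value

def set_path(doc, path, value):
    """Return a copy of doc with value stored at path ("$" replaces doc entirely)."""
    keys = [part for part in path.lstrip("$").split(".") if part]
    if not keys:
        return value
    result = copy.deepcopy(doc)
    node = result
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return result

def select(template, doc):
    """Build a dict from a template: keys ending in ".$" are looked up in doc."""
    out = {}
    for key, val in template.items():
        if key.endswith(".$"):
            out[key[:-2]] = get_path(doc, val)
        else:
            out[key] = val
    return out

def run_task_state(state_input, task, input_path="$", result_selector=None,
                   result_path="$", output_path="$"):
    task_input = get_path(state_input, input_path)          # 1. InputPath
    result = task(task_input)                               # the task itself
    if result_selector is not None:
        result = select(result_selector, result)            # 2. ResultSelector
    if result_path is None:
        combined = state_input                              # null ResultPath discards the result
    else:
        combined = set_path(state_input, result_path, result)  # 3. ResultPath
    return get_path(combined, output_path)                  # 4. OutputPath

# Hypothetical input and task, for illustration only.
event = {"detail": {"commitId": "abc123", "repositoryName": "ml-repo"}}

def fake_build(task_input):
    return {"Build": {"Id": "build-1", "BuildStatus": "SUCCEEDED"}}

output = run_task_state(
    event,
    fake_build,
    input_path="$.detail",
    result_selector={"status.$": "$.Build.BuildStatus"},
    result_path="$.response",
)
print(output)
# {'detail': {'commitId': 'abc123', 'repositoryName': 'ml-repo'}, 'response': {'status': 'SUCCEEDED'}}
```

Note that in this sketch ResultPath is applied to the original state input, not to the part selected by InputPath, which is why the whole `detail` object is still present in the output.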
How to pass the parameters
Suppose you have an ML pipeline built on Step Functions. You want to use CodeBuild to train your model and Lambda functions to deploy it to different environments, for example a dev environment. You also want to pass the ID of the commit that triggered the workflow to the Lambda function that performs the deployment.
Let’s see how to pass all these parameters to the Step Functions workflow.
Our example pipeline was triggered by a CodeCommit commit, so all the information about the commit arrives as the state input. When JSON is passed to a state, we can reference values in it using the $ sign: to get the value of the detail key, we specify $.detail, and to get the commitId value nested inside detail, we specify $.detail.commitId. Inside Parameters, a key whose value is such a path must end with .$ (for example Value.$); otherwise the value is treated as a literal string.
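To make the referencing rule concrete, here is a small Python sketch of how a Parameters block could be resolved against the state input. It is an illustration under assumptions, not Step Functions code: `resolve` is a hypothetical helper that handles only plain dotted paths, and the event and project name are made-up sample values.

```python
def resolve(template, state_input):
    """Resolve a Parameters-style template against the state input.

    Keys ending in ".$" hold a path that is looked up in state_input;
    every other value is copied through as a literal.
    """
    if isinstance(template, list):
        return [resolve(item, state_input) for item in template]
    if not isinstance(template, dict):
        return template
    resolved = {}
    for key, value in template.items():
        if key.endswith(".$"):
            node = state_input
            for part in [p for p in value.lstrip("$").split(".") if p]:
                node = node[part]
            resolved[key[:-2]] = node  # drop the ".$" suffix from the key
        else:
            resolved[key] = resolve(value, state_input)
    return resolved

# Hypothetical, trimmed-down trigger event used as the state input.
state_input = {"detail": {"commitId": "abc123", "referenceName": "main"}}

parameters = {
    "ProjectName": "my-training-project",  # made-up literal value
    "EnvironmentVariablesOverride": [
        {"Name": "COMMIT_ID", "Type": "PLAINTEXT", "Value.$": "$.detail.commitId"}
    ],
}

resolved = resolve(parameters, state_input)
print(resolved["EnvironmentVariablesOverride"][0]["Value"])  # abc123
```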
Pretty easy, isn’t it? But what if we want to use it in subsequent states?
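Then we should specify ResultPath, for example as $.response, so that the result of our task is added to the state input JSON under the key response. Without it, ResultPath defaults to $ and the task result replaces the state input, so the commit information would be lost. With ResultPath set, the next task gets the combined JSON as its input, and you can still use commitId or any other parameter from the initial input.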
{
  Comment = "CI/CD for the ${var.feature_name} ML inference deployment",
  StartAt = "CodeBuildTrain",
  States = {
    "CodeBuildTrain": {
      "Type": "Task",
      "Resource": "arn:aws:states:::codebuild:startBuild.sync",
      "Parameters": {
        "ProjectName": aws_codebuild_project.codebuild_project.name,
        "EnvironmentVariablesOverride": [
          {
            "Name": "COMMIT_ID",
            "Type": "PLAINTEXT",
            # The ".$" suffix means the value is read from the state input
            "Value.$": "$.detail.commitId"
          }
        ]
      },
      # Add the build result under "response" and keep the original input
      "ResultPath": "$.response",
      "Next" : "DeployLambdaStage"
    },
    "DeployLambdaStage" : {
      "Type" : "Task",
      "Resource" : aws_lambda_function.deploy_lambda.arn,
      "Parameters" : {
        "env": "dev",
        # Still available because the previous state preserved its input
        "commit.$": "$.detail.commitId"
      },
      "End" : true
    }
  }
}
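Why the ResultPath on CodeBuildTrain matters for DeployLambdaStage can be checked with a few lines of Python. This is a rough local sketch: `apply_result_path` is a hypothetical helper that only handles top-level keys, and the input and build result are made-up sample values.

```python
import copy

def apply_result_path(state_input, result, result_path="$"):
    """Combine a task result with the state input (top-level keys only)."""
    if result_path is None:   # null ResultPath: discard the result, keep the input
        return state_input
    if result_path == "$":    # default: the result replaces the input
        return result
    merged = copy.deepcopy(state_input)
    merged[result_path[2:]] = result  # "$.response" -> key "response"
    return merged

state_input = {"detail": {"commitId": "abc123"}}
build_result = {"Build": {"BuildStatus": "SUCCEEDED"}}

replaced = apply_result_path(state_input, build_result)
kept = apply_result_path(state_input, build_result, "$.response")

print("detail" in replaced)         # False
print(kept["detail"]["commitId"])   # abc123
```

With the default, the next state would no longer find $.detail.commitId in its input; with $.response, both the original event and the build result are available.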
If you struggle with passing all the parameters correctly, you can use the Data flow simulator in the Step Functions console. It lets you provide a state input as JSON and step through how it is transformed.
Thank you for reading, let’s chat 💬
💬 Do you use Step Functions for your data and ML workflows?
💬 If not, which orchestrator do you use and why?
💬 What struggles have you had when using Step Functions?
I love hearing from readers 🫶🏻 Please feel free to drop comments, questions, and opinions below👇🏻