Datacake AI Nodes

AI-powered data analysis and report generation for IoT applications using OpenAI's Responses API.

Overview

The Datacake AI nodes enable you to analyze IoT sensor data, generate reports, and extract insights using advanced AI models from OpenAI. Perfect for:

  • 📊 Automated data analysis and anomaly detection

  • 📝 Report generation (Markdown, HTML, PDF-ready)

  • 💡 Predictive maintenance insights

  • 📈 Time series analysis and trend detection

  • 🔍 Root cause analysis

  • 💻 Python code execution for calculations and visualizations

Configuration

Datacake AI Config

Configuration node for storing OpenAI API credentials securely.

Settings

  • Name - Optional descriptive name for this configuration

  • OpenAI API Key - Your OpenAI API key (starts with sk-...)

How to Get Your API Key

  1. Sign in or create an account at the OpenAI Platform (platform.openai.com)

  2. Open the API keys page and click "Create new secret key"

  3. Copy the key (starts with sk-...)

  4. Store it securely in the Datacake AI Config node

⚠️ Important: Keep your API key secure. Never share it or commit it to version control.


Datacake AI Node

Execute AI-powered analysis and report generation using OpenAI's models.

Configuration

Model Selection

Choose the AI model based on your needs:

| Model | Speed | Capability | Cost (per 1M tokens) | Best For |
| --- | --- | --- | --- | --- |
| gpt-5 | Slower | Highest | $1.25 in / $10.00 out | Complex analysis, research |
| gpt-5-mini | Medium | High | $0.25 in / $2.00 out | Recommended for most use cases |
| gpt-5-nano | Fastest | Good | $0.05 in / $0.40 out | Simple tasks, quick responses |

💡 The UI shows real-time cost estimates based on your max token setting.

Prompt Configuration

Prompt Source:

  • From msg.prompt - Dynamic prompts from incoming messages (flexible)

  • From Configuration - Static prompt configured in the node (reusable)

Include msg.payload:

  • When enabled, automatically includes incoming data in the prompt

  • Perfect for CSV, JSON, or text data analysis

  • Data type is automatically detected and formatted
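As a sketch, a Function node upstream of the AI node can set a static prompt while leaving the data in msg.payload for the node to append (the prompt text here is illustrative; in Node-RED the function body runs with `msg` in scope, wrapped here so it reads as a unit):

```javascript
// Sketch of a Node-RED Function node body placed before the Datacake AI node.
// Assumes an earlier node delivered CSV sensor data in msg.payload; with
// "Include msg.payload" enabled, the AI node appends that data to the prompt.
function prepare(msg) {
    msg.prompt = "Summarize this temperature CSV and flag readings above 30°C.";
    return msg; // payload passes through untouched
}
```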

Tools

Code Interpreter:

  • ✅ Enable for any data analysis tasks

  • Executes Python code for calculations, statistics, and visualizations

  • Can read CSV/JSON data, perform complex calculations, generate charts

  • Essential for accurate numerical analysis

Web Search:

  • 🌐 Enable for real-time information retrieval

  • Access up-to-date information from the web

  • Useful for market research, trend analysis, fact-checking

  • Configure search context size (low/medium/high)

Advanced Options

Max Output Tokens:

  • Controls maximum response length

  • Higher values = longer responses but higher cost

  • Default: 16,000 tokens (~12,000 words)

  • Typical reports use 2,000-8,000 tokens

Input Properties

| Property | Type | Required | Description |
| --- | --- | --- | --- |
| msg.prompt | string | No* | The AI prompt/instruction (*required when using "From msg.prompt" mode) |
| msg.payload | any | No | Data to analyze (CSV, JSON, text) when "Include msg.payload" is enabled |
| msg.previousResponseId | string | No | Previous response ID for conversational context |
| msg.model | string | No | Override the configured model (e.g., "gpt-5-mini") |

Output Properties

| Property | Type | Description |
| --- | --- | --- |
| msg.payload | string | Clean AI response text (user-friendly, ready to use) |
| msg.responseId | string | Response ID for follow-up questions with context |
| msg.openai | object | Detailed metadata about the request |

Output Metadata (msg.openai)

Status Display

The node shows real-time status in the editor:

  • 🟢 Success ($0.00142) - Request completed with cost

  • 🔵 Processing... - Request in progress

  • 🔴 Error - Request failed with error message


Use Cases & Examples

1. IoT Sensor Data Analysis

Analyze CSV sensor data and identify anomalies, trends, and patterns.

Flow:
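A minimal Function-node sketch for this use case, sitting between the data source and the AI node (prompt wording is illustrative; msg fields follow the Input Properties table):

```javascript
// Sketch: Function node feeding the Datacake AI node for sensor analysis.
// Assumes an upstream node delivered CSV history in msg.payload and that
// Code Interpreter is enabled on the AI node for accurate statistics.
function buildAnalysisRequest(msg) {
    msg.prompt =
        "You are analyzing IoT sensor data (CSV with timestamp and value columns). " +
        "Identify anomalies, trends, and patterns, and summarize key findings.";
    return msg; // AI node appends msg.payload when "Include msg.payload" is on
}
```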

Example Output:


2. Automated Markdown Report Generation

Generate professional reports with actual calculations from data.

Flow:
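A hedged sketch of the prompt-building Function node for a report flow (section names and wording are illustrative, not a prescribed format):

```javascript
// Sketch: Function node requesting a Markdown report from the AI node.
// Enable Code Interpreter on the AI node so statistics are actually computed.
function buildReportRequest(msg) {
    msg.prompt =
        "Generate a professional Markdown report for the attached sensor data. " +
        "Calculate actual statistics (min, max, mean, std) with code. " +
        "Include a summary table and a findings section. " +
        "Output ONLY the markdown.";
    return msg;
}
```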

Tips for Report Generation:

  • Always enable Code Interpreter for accurate calculations

  • Request "Output ONLY the markdown" to get clean results

  • Use specific format requirements (tables, sections, etc.)

  • Request charts/graphs descriptions for visualization needs


3. Conversational Data Analysis

Have a multi-turn conversation about your data with context.

First Message:

Follow-up Message:
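Sketched as two Function nodes, where the follow-up copies msg.responseId from the first reply into msg.previousResponseId (the prompts are illustrative):

```javascript
// First message: start a fresh conversation with no previous context.
function firstMessage(msg) {
    msg.prompt = "Analyze this sensor data and list the top three anomalies.";
    delete msg.previousResponseId;
    return msg;
}

// Follow-up: runs on the AI node's output, carrying its responseId forward.
function followUp(msg) {
    msg.previousResponseId = msg.responseId; // links to the prior analysis
    msg.prompt = "For the second anomaly, what are the likely root causes?";
    return msg;
}
```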

Benefits:

  • Maintain context across multiple questions

  • Drill down into specific findings

  • Build on previous analysis

  • More efficient token usage (context is cached)


4. Predictive Maintenance Analysis

Analyze device trends and predict potential failures.

Flow:
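A sketch of the prompt-building step (the metrics named and the 30-day horizon are illustrative assumptions):

```javascript
// Sketch: Function node framing a predictive-maintenance question.
// Assumes msg.payload carries recent device metrics (JSON or CSV).
function buildMaintenanceRequest(msg) {
    msg.prompt =
        "Given this device's recent vibration and temperature history, " +
        "assess failure risk over the next 30 days and recommend maintenance actions.";
    return msg;
}
```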


5. Fleet Health Summary with AI Insights

Combine fleet data with AI analysis for executive reports.

Flow:


6. Energy Consumption Optimization

Analyze consumption patterns and suggest optimizations.

Flow:


7. Real-time Market Research

Combine IoT data with web research for competitive analysis.

Flow:


8. Anomaly Detection with Root Cause Analysis

Detect anomalies and automatically investigate root causes.

Flow:


Advanced Features

Conversational Context

Use msg.previousResponseId to maintain context across multiple AI interactions:
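One way to wire this is to stash each responseId in flow context and attach it to the next request (a sketch; the "aiResponseId" key is an arbitrary choice, and in Node-RED these bodies use the built-in `flow` context directly — it is passed in here so the logic reads as plain functions):

```javascript
// Function node BEFORE the AI node: attach the last response ID, if any.
function attachContext(msg, flow) {
    const lastId = flow.get("aiResponseId");
    if (lastId) msg.previousResponseId = lastId; // reuse prior context
    return msg;
}

// Function node AFTER the AI node: remember this reply for the next turn.
function storeContext(msg, flow) {
    flow.set("aiResponseId", msg.responseId);
    return msg;
}
```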

Benefits:

  • More efficient (context tokens are cached and cheaper)

  • AI understands references to previous analysis

  • Can build complex analysis step-by-step

  • Natural conversation flow

Dynamic Model Selection

Override the configured model based on task complexity:
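For example, a Function node can route large payloads to a stronger model (the size threshold is an arbitrary illustration):

```javascript
// Sketch: pick a model per message based on task complexity.
// msg.model overrides whatever model is configured on the AI node.
function chooseModel(msg) {
    const size = typeof msg.payload === "string" ? msg.payload.length : 0;
    msg.model = size > 50000 ? "gpt-5" : "gpt-5-nano";
    return msg;
}
```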

Error Handling

Always include error handling for AI nodes:
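A minimal sketch of a Function node wired to a Catch node covering the AI node (Node-RED's Catch node populates msg.error; the fallback payload shape is an assumption):

```javascript
// Sketch: handle AI-node failures caught by a Catch node.
// Emits a fallback payload so downstream flows can continue gracefully.
function handleAiError(msg) {
    const reason = (msg.error && msg.error.message) || "unknown error";
    msg.payload = { ok: false, reason: reason }; // fallback for downstream
    return msg;
}
```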


Cost Management

Understanding Costs

The node automatically calculates costs for every request:

Token Types:

  • Input tokens: Your prompt + data

  • Cached tokens: Reused context (90% discount)

  • Output tokens: AI response

Cost Example:
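A worked example using the gpt-5-mini rates from the pricing table below (the token counts are illustrative):

```javascript
// 10,000 input tokens and 2,000 output tokens on gpt-5-mini:
// $0.25 per 1M input tokens, $2.00 per 1M output tokens.
const inputTokens = 10000;   // prompt + data
const outputTokens = 2000;   // AI response
const cost = (inputTokens / 1e6) * 0.25 + (outputTokens / 1e6) * 2.00;
console.log(cost.toFixed(4)); // "0.0065" — well under a cent
```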

Cost Optimization Tips

  1. Choose the right model:

    • Use gpt-5-nano for simple tasks ($0.40/1M output)

    • Use gpt-5-mini for most tasks ($2.00/1M output)

    • Reserve gpt-5 for complex analysis ($10.00/1M output)

  2. Optimize prompts:

    • Be specific but concise

    • Avoid sending unnecessary data

    • Use conversational context to reduce repeated information

  3. Control output length:

    • Set appropriate Max Output Tokens

    • Request concise outputs when appropriate

    • Use bullet points instead of paragraphs

  4. Use context efficiently:

    • Pass previousResponseId for follow-up questions

    • Cached tokens cost 90% less

    • Build on previous analysis instead of repeating

  5. Monitor usage:

    • Check msg.openai.cost after each request

    • Log costs to track spending over time

    • Set up alerts for high-cost requests

    • View usage at OpenAI Platform

Cost Estimation

The node shows real-time cost estimates in the configuration UI based on your Max Output Tokens setting.

Typical Costs:

  • Simple analysis: $0.001 - $0.005

  • Medium report: $0.005 - $0.020

  • Complex analysis: $0.020 - $0.100

  • With web search: +$0.005 - $0.020


Best Practices

Prompts

  1. Be specific: "Analyze temperature data and identify anomalies above 30°C" vs "Analyze data"

  2. Request structure: "Provide output as JSON with fields: issue, severity, recommendation"

  3. Set expectations: "Keep response under 200 words" or "Provide detailed analysis"

  4. Include context: Explain what the data represents and what you need

  5. Request calculations: "Calculate actual statistics" when Code Interpreter is enabled

Data Handling

  1. Format data properly: CSV or JSON for structured data

  2. Limit data size: Max ~100KB for optimal performance

  3. Preprocess when needed: Filter/aggregate large datasets before sending

  4. Include headers: CSV headers or JSON keys are important for understanding

  5. Add metadata: Include timestamps, units, device names

Performance

  1. Typical response times:

    • Simple queries: 5-15 seconds

    • With Code Interpreter: 10-30 seconds

    • With Web Search: 15-60 seconds

    • Complex analysis: 30-90 seconds

  2. Timeout: 5 minutes (300 seconds)

  3. Optimization tips:

    • Use simpler models for faster responses

    • Reduce output token limit for quicker replies

    • Avoid web search when not needed

Error Handling

  1. Always use Catch nodes for AI nodes

  2. Check for rate limits and implement backoff

  3. Validate inputs before sending to AI

  4. Log errors with context for debugging

  5. Provide fallback logic for critical flows


Troubleshooting

Common Issues

"Missing API key" error

  • Ensure Datacake AI Config node has valid OpenAI API key

  • Verify key starts with sk-

  • Check that key is still valid at OpenAI Platform

"Rate limit exceeded" error

  • You've exceeded OpenAI API rate limits

  • Wait and retry

  • Consider upgrading your OpenAI plan

  • Implement request throttling in your flows

"Quota exceeded" error

  • Your OpenAI account has insufficient credits

  • Add credits at OpenAI Platform

  • Check your usage limits

Request timeout

  • Query is too complex or data too large

  • Try with smaller dataset

  • Reduce Max Output Tokens

  • Simplify the prompt

  • Use a faster model (gpt-5-nano)

Unexpected response format

  • Check msg.openai.fullResponse for debugging

  • AI response format can vary

  • Request specific format in prompt

  • Add response format examples in prompt

Cost higher than expected

  • Check msg.openai.cost to see actual costs

  • Large data inputs increase input tokens

  • Long responses increase output tokens

  • Web search adds cost

  • Code execution adds compute time


Resources


Pricing Reference

| Model | Input (per 1M tokens) | Cached Input (per 1M tokens) | Output (per 1M tokens) |
| --- | --- | --- | --- |
| gpt-5 | $1.25 | $0.125 | $10.00 |
| gpt-5-mini | $0.25 | $0.025 | $2.00 |
| gpt-5-nano | $0.05 | $0.005 | $0.40 |

Note: Cached tokens (from context reuse) cost 90% less than regular input tokens.


Example Prompts

Data Analysis
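A sample prompt of this kind (illustrative, not taken from the node's own examples):

```
Analyze the attached CSV of temperature readings. Identify anomalies,
describe any daily or weekly patterns, and calculate min, max, mean,
and standard deviation using code. Summarize findings in bullet points.
```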

Report Generation
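An illustrative report prompt (section names are a suggestion, not a requirement):

```
Generate a Markdown report for the attached sensor data with sections:
Executive Summary, Key Metrics (as a table), Anomalies, Recommendations.
Calculate actual statistics with code. Output ONLY the markdown.
```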

Predictive Maintenance
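An illustrative maintenance prompt (the metrics named are assumptions):

```
Based on this device's vibration and temperature history, estimate the
likelihood of failure, identify early warning signs in the data, and
recommend a maintenance schedule with reasoning.
```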

Optimization Recommendations
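An illustrative optimization prompt (the scenario is an assumption):

```
Analyze this building's hourly energy consumption. Identify the largest
consumption drivers and waste patterns (e.g., off-hours usage), and
suggest three concrete optimizations with estimated savings.
```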
