diff --git a/skills/code-interpreter/SKILL.md b/skills/code-interpreter/SKILL.md
new file mode 100644
index 0000000..01e6d9c
--- /dev/null
+++ b/skills/code-interpreter/SKILL.md
@@ -0,0 +1,150 @@
+---
+name: code-interpreter
+description: Local Python code execution for calculations, tabular data inspection, CSV/JSON processing, simple plotting, text transformation, quick experiments, and reproducible analysis inside the OpenClaw workspace. Use when the user wants ChatGPT-style code interpreter behavior locally: run Python, analyze files, compute exact answers, transform data, inspect tables, or generate output files/artifacts. Prefer this for low-risk local analysis; do not use it for untrusted code, secrets handling, privileged actions, or network-dependent tasks.
+---
+
+# Code Interpreter
+
+Run local Python code through the bundled runner.
+
+## Safety boundary
+
+This is **local execution**, not a hardened container. Treat it as a convenience tool for trusted, low-risk tasks.
+
+Always:
+- Keep work inside the OpenClaw workspace when possible.
+- Prefer reading/writing files under the current task directory or an explicit artifact directory.
+- Keep timeouts short by default.
+- Avoid network access unless the user explicitly asks and the task truly needs it.
+- Do not execute untrusted code copied from the web or supplied by other people.
+- Do not expose secrets, tokens, SSH keys, browser cookies, or system files to the script.
+ +Do not use this skill for: +- system administration +- package installation loops +- long-running servers +- privileged operations +- destructive file changes outside the workspace +- executing arbitrary third-party code verbatim + +## Runner + +Run from the OpenClaw workspace: + +```bash +python3 {baseDir}/scripts/run_code.py --code 'print(2 + 2)' +``` + +Or pass a script file: + +```bash +python3 {baseDir}/scripts/run_code.py --file path/to/script.py +``` + +Or pipe code via stdin: + +```bash +cat my_script.py | python3 {baseDir}/scripts/run_code.py --stdin +``` + +## Useful options + +```bash +# set timeout seconds (default 20) +python3 {baseDir}/scripts/run_code.py --code '...' --timeout 10 + +# run from a specific working directory inside workspace +python3 {baseDir}/scripts/run_code.py --file script.py --cwd /home/selig/.openclaw/workspace/project + +# keep outputs in a known artifact directory inside workspace +python3 {baseDir}/scripts/run_code.py --file script.py --artifact-dir /home/selig/.openclaw/workspace/.tmp/my-analysis + +# save full stdout / stderr +python3 {baseDir}/scripts/run_code.py --code '...' --stdout-file out.txt --stderr-file err.txt +``` + +## Built-in environment + +The runner uses the dedicated interpreter at: + +- `/home/selig/.openclaw/workspace/.venv-code-interpreter/bin/python` (use the venv path directly; do not resolve the symlink to system Python) + +This keeps plotting/data-analysis dependencies stable without touching the system Python. + +The runner exposes these variables to the script: + +- `OPENCLAW_WORKSPACE` +- `CODE_INTERPRETER_RUN_DIR` +- `CODE_INTERPRETER_ARTIFACT_DIR` + +It also writes a helper file in the run directory: + +```python +from ci_helpers import save_text, save_json +``` + +Use those helpers to save artifacts into `CODE_INTERPRETER_ARTIFACT_DIR`. 
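The helpers map directly onto plain file writes into the artifact directory. A minimal standalone sketch of the assumed `save_json` behavior, mirrored from the bundled `ci_helpers.py` (outside the runner `CODE_INTERPRETER_ARTIFACT_DIR` is unset, so this sketch fakes it with a temp directory):

```python
# Standalone sketch of the save_json helper (assumed behavior, mirrored from
# the bundled ci_helpers.py). In a real run the runner sets
# CODE_INTERPRETER_ARTIFACT_DIR; here we point it at a temp dir to illustrate.
import json
import os
import tempfile
from pathlib import Path

os.environ.setdefault('CODE_INTERPRETER_ARTIFACT_DIR', tempfile.mkdtemp())
ARTIFACT_DIR = Path(os.environ['CODE_INTERPRETER_ARTIFACT_DIR'])


def save_json(name: str, data) -> str:
    """Write data as pretty-printed UTF-8 JSON under the artifact dir."""
    path = ARTIFACT_DIR / name
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding='utf-8')
    return str(path)


saved = save_json('stats.json', {'rows': 3, 'ok': True})
print(saved)
```

Inside a real run, skip the setup above: just `from ci_helpers import save_json` and call it, since the runner has already exported the environment variables.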
+ +## V4 automatic data analysis + +For automatic profiling/report generation from a local data file, use: + +- `scripts/analyze_data.py` +- Reference: `references/v4-usage.md` + +This flow is ideal when the user wants a fast "analyze this CSV/JSON/Excel and give me a report + plots" result. + +## Output + +The runner prints compact JSON: + +```json +{ + "ok": true, + "exitCode": 0, + "timeout": false, + "runDir": "...", + "artifactDir": "...", + "packageStatus": {"pandas": true, "numpy": true, "matplotlib": false}, + "artifacts": [{"path": "...", "bytes": 123}], + "stdout": "...", + "stderr": "..." +} +``` + +## Workflow + +1. Decide whether the task is a good fit for local trusted execution. +2. Write the smallest script that solves the problem. +3. Use `--artifact-dir` when the user may want generated files preserved. +4. Run with a short timeout. +5. Inspect `stdout`, `stderr`, and `artifacts`. +6. If producing files, mention their exact paths in the reply. + +## Patterns + +### Exact calculation +Use a one-liner with `--code`. + +### File analysis +Read input files from workspace, then write summaries/derived files back to `artifactDir`. + +### Automatic report bundle +When the user wants a quick profiling pass, run `scripts/analyze_data.py` against the file and return the generated `summary.json`, `report.md`, `preview.csv`, and any PNG plots. + +### Table inspection +Prefer pandas when available; otherwise fall back to csv/json stdlib. + +### Plotting +If `matplotlib` is available, write PNG files to `artifactDir`. Use a forced CJK font strategy for Chinese charts. The bundled default is Google Noto Sans CJK TC under `assets/fonts/` when present, then system fallbacks. Apply the chosen font not only via rcParams but also directly to titles, axis labels, tick labels, and legend text through FontProperties. This avoids tofu/garbled Chinese and suppresses missing-glyph warnings reliably. If plotting is unavailable, continue with tabular/text output. 
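The forced-font strategy above can be sketched as follows. The candidate paths are assumptions (the bundled Noto path depends on where the skill is installed); if neither exists, the chart still renders with the default font:

```python
# Sketch: register a CJK-capable font and apply it to every text element,
# not only via rcParams. Font paths below are assumptions; adjust as needed.
from pathlib import Path

import matplotlib
matplotlib.use('Agg')  # headless rendering, matching the runner's MPLBACKEND
from matplotlib import font_manager
import matplotlib.pyplot as plt

CANDIDATES = [
    'assets/fonts/NotoSansCJKtc-Regular.otf',                     # bundled default
    '/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf',  # system fallback
]

font_prop = None
for candidate in CANDIDATES:
    if Path(candidate).exists():
        font_manager.fontManager.addfont(candidate)
        font_prop = font_manager.FontProperties(fname=candidate)
        matplotlib.rcParams['font.family'] = [font_prop.get_name()]
        matplotlib.rcParams['axes.unicode_minus'] = False
        break

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(['甲', '乙', '丙'], [3, 1, 2])
ax.set_title('類別比較')
ax.set_xlabel('類別')
ax.set_ylabel('數量')
if font_prop is not None:
    # Apply the font directly so titles, labels, and ticks cannot fall back
    # to a font without CJK glyphs (the "tofu" problem).
    ax.title.set_fontproperties(font_prop)
    ax.xaxis.label.set_fontproperties(font_prop)
    ax.yaxis.label.set_fontproperties(font_prop)
    for label in ax.get_xticklabels() + ax.get_yticklabels():
        label.set_fontproperties(font_prop)
fig.tight_layout()
fig.savefig('category_chart.png', dpi=160)
plt.close(fig)
```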
+ +### Reusable logic +Write a small `.py` file in the current task area, run with `--file`, then keep it if it may be reused. + +## Notes + +- The runner launches `python3 -B` with a minimal environment. +- It creates an isolated temp run directory under `workspace/.tmp/code-interpreter-runs/`. +- `stdout` / `stderr` are truncated in the JSON preview if very large; save to files when needed. +- `MPLBACKEND=Agg` is set so headless plotting works when matplotlib is installed. +- If a task needs stronger isolation than this local runner provides, do not force it—use a real sandbox/container approach instead. diff --git a/skills/code-interpreter/assets/fonts/NotoSansCJKtc-Regular.otf b/skills/code-interpreter/assets/fonts/NotoSansCJKtc-Regular.otf new file mode 100644 index 0000000..f9376ba Binary files /dev/null and b/skills/code-interpreter/assets/fonts/NotoSansCJKtc-Regular.otf differ diff --git a/skills/code-interpreter/references/v4-usage.md b/skills/code-interpreter/references/v4-usage.md new file mode 100644 index 0000000..8d3189f --- /dev/null +++ b/skills/code-interpreter/references/v4-usage.md @@ -0,0 +1,29 @@ +# V4 Usage + +## Purpose + +Generate an automatic data analysis bundle from a local data file. 
+ +## Command + +```bash +/home/selig/.openclaw/workspace/.venv-code-interpreter/bin/python \ + /home/selig/.openclaw/workspace/skills/code-interpreter/scripts/analyze_data.py \ + /path/to/input.csv \ + --artifact-dir /home/selig/.openclaw/workspace/.tmp/my-analysis +``` + +## Outputs + +- `summary.json` — machine-readable profile +- `report.md` — human-readable summary +- `preview.csv` — first 50 rows after parsing +- `*.png` — generated plots when matplotlib is available + +## Supported inputs + +- `.csv` +- `.tsv` +- `.json` +- `.xlsx` +- `.xls` diff --git a/skills/code-interpreter/scripts/__pycache__/analyze_data.cpython-312.pyc b/skills/code-interpreter/scripts/__pycache__/analyze_data.cpython-312.pyc new file mode 100644 index 0000000..9754d2a Binary files /dev/null and b/skills/code-interpreter/scripts/__pycache__/analyze_data.cpython-312.pyc differ diff --git a/skills/code-interpreter/scripts/__pycache__/run_code.cpython-312.pyc b/skills/code-interpreter/scripts/__pycache__/run_code.cpython-312.pyc new file mode 100644 index 0000000..e03f28e Binary files /dev/null and b/skills/code-interpreter/scripts/__pycache__/run_code.cpython-312.pyc differ diff --git a/skills/code-interpreter/scripts/analyze_data.py b/skills/code-interpreter/scripts/analyze_data.py new file mode 100644 index 0000000..ec94f3b --- /dev/null +++ b/skills/code-interpreter/scripts/analyze_data.py @@ -0,0 +1,285 @@ +#!/usr/bin/env python3 +import argparse +import json +import math +import os +from pathlib import Path + +try: + import pandas as pd +except ImportError: + raise SystemExit( + 'pandas is required. Run with the code-interpreter venv:\n' + ' ~/.openclaw/workspace/.venv-code-interpreter/bin/python analyze_data.py ...' 
+ ) + +try: + import matplotlib + import matplotlib.pyplot as plt + HAS_MPL = True +except Exception: + HAS_MPL = False + +ZH_FONT_CANDIDATES = [ + '/home/selig/.openclaw/workspace/skills/code-interpreter/assets/fonts/NotoSansCJKtc-Regular.otf', + '/usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf', +] + + +def configure_matplotlib_fonts() -> tuple[str | None, object | None]: + if not HAS_MPL: + return None, None + chosen = None + chosen_prop = None + for path in ZH_FONT_CANDIDATES: + if Path(path).exists(): + try: + from matplotlib import font_manager + font_manager.fontManager.addfont(path) + font_prop = font_manager.FontProperties(fname=path) + font_name = font_prop.get_name() + matplotlib.rcParams['font.family'] = [font_name] + matplotlib.rcParams['axes.unicode_minus'] = False + chosen = font_name + chosen_prop = font_prop + break + except Exception: + continue + return chosen, chosen_prop + + +def apply_font(ax, font_prop) -> None: + if not font_prop: + return + title = ax.title + if title: + title.set_fontproperties(font_prop) + ax.xaxis.label.set_fontproperties(font_prop) + ax.yaxis.label.set_fontproperties(font_prop) + for label in ax.get_xticklabels(): + label.set_fontproperties(font_prop) + for label in ax.get_yticklabels(): + label.set_fontproperties(font_prop) + legend = ax.get_legend() + if legend: + for text in legend.get_texts(): + text.set_fontproperties(font_prop) + legend.get_title().set_fontproperties(font_prop) + + +def detect_format(path: Path) -> str: + ext = path.suffix.lower() + if ext in {'.csv', '.tsv', '.txt'}: + return 'delimited' + if ext == '.json': + return 'json' + if ext in {'.xlsx', '.xls'}: + return 'excel' + raise SystemExit(f'Unsupported file type: {ext}') + + +def load_df(path: Path) -> pd.DataFrame: + fmt = detect_format(path) + if fmt == 'delimited': + sep = '\t' if path.suffix.lower() == '.tsv' else ',' + return pd.read_csv(path, sep=sep) + if fmt == 'json': + try: + return pd.read_json(path) + except ValueError: + 
return pd.DataFrame(json.loads(path.read_text(encoding='utf-8'))) + if fmt == 'excel': + return pd.read_excel(path) + raise SystemExit('Unsupported format') + + +def safe_name(s: str) -> str: + keep = [] + for ch in s: + if ch.isalnum() or ch in ('-', '_'): + keep.append(ch) + elif ch in (' ', '/'): + keep.append('_') + out = ''.join(keep).strip('_') + return out[:80] or 'column' + + +def series_stats(s: pd.Series) -> dict: + non_null = s.dropna() + result = { + 'dtype': str(s.dtype), + 'nonNull': int(non_null.shape[0]), + 'nulls': int(s.isna().sum()), + 'unique': int(non_null.nunique()) if len(non_null) else 0, + } + if pd.api.types.is_numeric_dtype(s): + result.update({ + 'min': None if non_null.empty else float(non_null.min()), + 'max': None if non_null.empty else float(non_null.max()), + 'mean': None if non_null.empty else float(non_null.mean()), + 'sum': None if non_null.empty else float(non_null.sum()), + }) + else: + top = non_null.astype(str).value_counts().head(5) + result['topValues'] = [{ + 'value': str(idx), + 'count': int(val), + } for idx, val in top.items()] + return result + + +def maybe_parse_dates(df: pd.DataFrame) -> tuple[pd.DataFrame, list[str]]: + parsed = [] + out = df.copy() + for col in out.columns: + if out[col].dtype == 'object': + sample = out[col].dropna().astype(str).head(20) + if sample.empty: + continue + parsed_col = pd.to_datetime(out[col], errors='coerce') + success_ratio = float(parsed_col.notna().mean()) if len(out[col]) else 0.0 + if success_ratio >= 0.6: + out[col] = parsed_col + parsed.append(str(col)) + return out, parsed + + +def write_report(df: pd.DataFrame, summary: dict, out_dir: Path) -> Path: + lines = [] + lines.append('# Data Analysis Report') + lines.append('') + lines.append(f"- Source: `{summary['source']}`") + lines.append(f"- Rows: **{summary['rows']}**") + lines.append(f"- Columns: **{summary['columns']}**") + lines.append(f"- Generated plots: **{len(summary['plots'])}**") + if summary['parsedDateColumns']: + 
lines.append(f"- Parsed date columns: {', '.join(summary['parsedDateColumns'])}") + lines.append('') + lines.append('## Columns') + lines.append('') + for name, meta in summary['columnProfiles'].items(): + lines.append(f"### {name}") + lines.append(f"- dtype: `{meta['dtype']}`") + lines.append(f"- non-null: {meta['nonNull']}") + lines.append(f"- nulls: {meta['nulls']}") + lines.append(f"- unique: {meta['unique']}") + if 'mean' in meta: + lines.append(f"- min / max: {meta['min']} / {meta['max']}") + lines.append(f"- mean / sum: {meta['mean']} / {meta['sum']}") + elif meta.get('topValues'): + preview = ', '.join([f"{x['value']} ({x['count']})" for x in meta['topValues'][:5]]) + lines.append(f"- top values: {preview}") + lines.append('') + report = out_dir / 'report.md' + report.write_text('\n'.join(lines).strip() + '\n', encoding='utf-8') + return report + + +def generate_plots(df: pd.DataFrame, out_dir: Path, font_prop=None) -> list[str]: + if not HAS_MPL: + return [] + plots = [] + numeric_cols = [c for c in df.columns if pd.api.types.is_numeric_dtype(df[c])] + date_cols = [c for c in df.columns if pd.api.types.is_datetime64_any_dtype(df[c])] + cat_cols = [c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c]) and not pd.api.types.is_datetime64_any_dtype(df[c])] + + if numeric_cols: + col = numeric_cols[0] + plt.figure(figsize=(7, 4)) + bins = min(20, max(5, int(math.sqrt(max(1, df[col].dropna().shape[0]))))) + df[col].dropna().hist(bins=bins) + plt.title(f'Histogram of {col}', fontproperties=font_prop) + plt.xlabel(str(col), fontproperties=font_prop) + plt.ylabel('Count', fontproperties=font_prop) + apply_font(plt.gca(), font_prop) + path = out_dir / f'hist_{safe_name(str(col))}.png' + plt.tight_layout() + plt.savefig(path, dpi=160) + plt.close() + plots.append(str(path)) + + if cat_cols and numeric_cols: + cat, num = cat_cols[0], numeric_cols[0] + grp = df.groupby(cat, dropna=False)[num].sum().sort_values(ascending=False).head(12) + if not grp.empty: + 
plt.figure(figsize=(8, 4.5)) + grp.plot(kind='bar') + plt.title(f'{num} by {cat}', fontproperties=font_prop) + plt.xlabel(str(cat), fontproperties=font_prop) + plt.ylabel(f'Sum of {num}', fontproperties=font_prop) + apply_font(plt.gca(), font_prop) + plt.tight_layout() + path = out_dir / f'bar_{safe_name(str(num))}_by_{safe_name(str(cat))}.png' + plt.savefig(path, dpi=160) + plt.close() + plots.append(str(path)) + + if date_cols and numeric_cols: + date_col, num = date_cols[0], numeric_cols[0] + grp = df[[date_col, num]].dropna().sort_values(date_col) + if not grp.empty: + plt.figure(figsize=(8, 4.5)) + plt.plot(grp[date_col], grp[num], marker='o') + plt.title(f'{num} over time', fontproperties=font_prop) + plt.xlabel(str(date_col), fontproperties=font_prop) + plt.ylabel(str(num), fontproperties=font_prop) + apply_font(plt.gca(), font_prop) + plt.tight_layout() + path = out_dir / f'line_{safe_name(str(num))}_over_time.png' + plt.savefig(path, dpi=160) + plt.close() + plots.append(str(path)) + + return plots + + +def main() -> int: + parser = argparse.ArgumentParser(description='Automatic data analysis report generator') + parser.add_argument('input', help='Input data file (csv/json/xlsx)') + parser.add_argument('--artifact-dir', required=True, help='Output artifact directory') + args = parser.parse_args() + + input_path = Path(args.input).expanduser().resolve() + artifact_dir = Path(args.artifact_dir).expanduser().resolve() + artifact_dir.mkdir(parents=True, exist_ok=True) + + df = load_df(input_path) + original_columns = [str(c) for c in df.columns] + df, parsed_dates = maybe_parse_dates(df) + chosen_font, chosen_font_prop = configure_matplotlib_fonts() + + preview_path = artifact_dir / 'preview.csv' + df.head(50).to_csv(preview_path, index=False) + + summary = { + 'source': str(input_path), + 'rows': int(df.shape[0]), + 'columns': int(df.shape[1]), + 'columnNames': original_columns, + 'parsedDateColumns': parsed_dates, + 'columnProfiles': {str(c): 
series_stats(df[c]) for c in df.columns}, + 'plots': [], + 'plotFont': chosen_font, + } + + summary['plots'] = generate_plots(df, artifact_dir, chosen_font_prop) + + summary_path = artifact_dir / 'summary.json' + summary_path.write_text(json.dumps(summary, ensure_ascii=False, indent=2), encoding='utf-8') + report_path = write_report(df, summary, artifact_dir) + + result = { + 'ok': True, + 'input': str(input_path), + 'artifactDir': str(artifact_dir), + 'summary': str(summary_path), + 'report': str(report_path), + 'preview': str(preview_path), + 'plots': summary['plots'], + } + print(json.dumps(result, ensure_ascii=False, indent=2)) + return 0 + + +if __name__ == '__main__': + raise SystemExit(main()) diff --git a/skills/code-interpreter/scripts/run_code.py b/skills/code-interpreter/scripts/run_code.py new file mode 100644 index 0000000..03d1cd1 --- /dev/null +++ b/skills/code-interpreter/scripts/run_code.py @@ -0,0 +1,241 @@ +#!/usr/bin/env python3 +import argparse +import importlib.util +import json +import os +import pathlib +import shutil +import subprocess +import sys +import tempfile +import time +from typing import Optional + +WORKSPACE = pathlib.Path('/home/selig/.openclaw/workspace').resolve() +RUNS_DIR = WORKSPACE / '.tmp' / 'code-interpreter-runs' +MAX_PREVIEW = 12000 +ARTIFACT_SCAN_LIMIT = 100 +PACKAGE_PROBES = ['pandas', 'numpy', 'matplotlib'] +PYTHON_BIN = str(WORKSPACE / '.venv-code-interpreter' / 'bin' / 'python') + + +def current_python_paths(run_dir_path: pathlib.Path) -> str: + """Build PYTHONPATH: run_dir (for ci_helpers) only. 
+ Venv site-packages are already on sys.path when using PYTHON_BIN.""" + return str(run_dir_path) + + +def read_code(args: argparse.Namespace) -> str: + sources = [bool(args.code), bool(args.file), bool(args.stdin)] + if sum(sources) != 1: + raise SystemExit('Provide exactly one of --code, --file, or --stdin') + if args.code: + return args.code + if args.file: + return pathlib.Path(args.file).read_text(encoding='utf-8') + return sys.stdin.read() + + +def ensure_within_workspace(path_str: Optional[str], must_exist: bool = True) -> pathlib.Path: + if not path_str: + return WORKSPACE + p = pathlib.Path(path_str).expanduser().resolve() + if p != WORKSPACE and WORKSPACE not in p.parents: + raise SystemExit(f'Path must stay inside workspace: {WORKSPACE}') + if must_exist and (not p.exists() or not p.is_dir()): + raise SystemExit(f'Path not found or not a directory: {p}') + return p + + +def ensure_output_path(path_str: Optional[str]) -> Optional[pathlib.Path]: + if not path_str: + return None + p = pathlib.Path(path_str).expanduser().resolve() + p.parent.mkdir(parents=True, exist_ok=True) + return p + + +def write_text(path_str: Optional[str], text: str) -> None: + p = ensure_output_path(path_str) + if not p: + return + p.write_text(text, encoding='utf-8') + + +def truncate(text: str) -> str: + if len(text) <= MAX_PREVIEW: + return text + extra = len(text) - MAX_PREVIEW + return text[:MAX_PREVIEW] + f'\n...[truncated {extra} chars]' + + +def package_status() -> dict: + out: dict[str, bool] = {} + for name in PACKAGE_PROBES: + proc = subprocess.run( + [PYTHON_BIN, '-c', f"import importlib.util; print('1' if importlib.util.find_spec('{name}') else '0')"], + capture_output=True, + text=True, + encoding='utf-8', + errors='replace', + ) + out[name] = proc.stdout.strip() == '1' + return out + + +def rel_to(path: pathlib.Path, base: pathlib.Path) -> str: + try: + return str(path.relative_to(base)) + except Exception: + return str(path) + + +def scan_artifacts(base_dir: 
pathlib.Path, root_label: str) -> list[dict]: + if not base_dir.exists(): + return [] + items: list[dict] = [] + for p in sorted(base_dir.rglob('*')): + if len(items) >= ARTIFACT_SCAN_LIMIT: + break + if p.is_file(): + try: + size = p.stat().st_size + except Exception: + size = None + items.append({ + 'root': root_label, + 'path': str(p), + 'relative': rel_to(p, base_dir), + 'bytes': size, + }) + return items + + +def write_helper(run_dir_path: pathlib.Path, artifact_dir: pathlib.Path) -> None: + helper = run_dir_path / 'ci_helpers.py' + helper.write_text( + """ +from pathlib import Path +import json +import os + +WORKSPACE = Path(os.environ['OPENCLAW_WORKSPACE']) +RUN_DIR = Path(os.environ['CODE_INTERPRETER_RUN_DIR']) +ARTIFACT_DIR = Path(os.environ['CODE_INTERPRETER_ARTIFACT_DIR']) + + +def save_text(name: str, text: str) -> str: + path = ARTIFACT_DIR / name + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text(text, encoding='utf-8') + return str(path) + + +def save_json(name: str, data) -> str: + path = ARTIFACT_DIR / name + path.parent.mkdir(parents=True, exist_ok=True) + path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding='utf-8') + return str(path) +""".lstrip(), + encoding='utf-8', + ) + + +def main() -> int: + parser = argparse.ArgumentParser(description='Local Python runner for OpenClaw code-interpreter skill') + parser.add_argument('--code', help='Python code to execute') + parser.add_argument('--file', help='Path to a Python file to execute') + parser.add_argument('--stdin', action='store_true', help='Read Python code from stdin') + parser.add_argument('--cwd', help='Working directory inside workspace') + parser.add_argument('--artifact-dir', help='Artifact directory inside workspace to keep outputs') + parser.add_argument('--timeout', type=int, default=20, help='Timeout seconds (default: 20)') + parser.add_argument('--stdout-file', help='Optional file path to save full stdout') + parser.add_argument('--stderr-file', 
help='Optional file path to save full stderr') + parser.add_argument('--keep-run-dir', action='store_true', help='Keep generated temp run directory even on success') + args = parser.parse_args() + + code = read_code(args) + cwd = ensure_within_workspace(args.cwd) + RUNS_DIR.mkdir(parents=True, exist_ok=True) + + run_dir_path = pathlib.Path(tempfile.mkdtemp(prefix='run-', dir=str(RUNS_DIR))).resolve() + artifact_dir = ensure_within_workspace(args.artifact_dir, must_exist=False) if args.artifact_dir else (run_dir_path / 'artifacts') + artifact_dir.mkdir(parents=True, exist_ok=True) + + script_path = run_dir_path / 'main.py' + script_path.write_text(code, encoding='utf-8') + write_helper(run_dir_path, artifact_dir) + + env = { + 'PATH': os.environ.get('PATH', '/usr/bin:/bin'), + 'HOME': str(run_dir_path), + 'PYTHONPATH': current_python_paths(run_dir_path), + 'PYTHONIOENCODING': 'utf-8', + 'PYTHONUNBUFFERED': '1', + 'OPENCLAW_WORKSPACE': str(WORKSPACE), + 'CODE_INTERPRETER_RUN_DIR': str(run_dir_path), + 'CODE_INTERPRETER_ARTIFACT_DIR': str(artifact_dir), + 'MPLBACKEND': 'Agg', + } + + started = time.time() + timed_out = False + exit_code = None + stdout = '' + stderr = '' + + try: + proc = subprocess.run( + [PYTHON_BIN, '-B', str(script_path)], + cwd=str(cwd), + env=env, + capture_output=True, + text=True, + encoding='utf-8', + errors='replace', + timeout=max(1, args.timeout), + ) + exit_code = proc.returncode + stdout = proc.stdout + stderr = proc.stderr + except subprocess.TimeoutExpired as exc: + timed_out = True + exit_code = 124 + raw_out = exc.stdout or '' + raw_err = exc.stderr or '' + stdout = raw_out if isinstance(raw_out, str) else raw_out.decode('utf-8', errors='replace') + stderr = (raw_err if isinstance(raw_err, str) else raw_err.decode('utf-8', errors='replace')) + f'\nExecution timed out after {args.timeout}s.' 
+
+    duration = round(time.time() - started, 3)
+
+    write_text(args.stdout_file, stdout)
+    write_text(args.stderr_file, stderr)
+
+    artifacts = scan_artifacts(artifact_dir, 'artifactDir')
+    if run_dir_path not in artifact_dir.parents:
+        artifacts.extend(scan_artifacts(run_dir_path / 'artifacts', 'runArtifacts'))
+
+    result = {
+        'ok': (exit_code == 0 and not timed_out),
+        'exitCode': exit_code,
+        'timeout': timed_out,
+        'durationSec': duration,
+        'cwd': str(cwd),
+        'runDir': str(run_dir_path),
+        'artifactDir': str(artifact_dir),
+        'packageStatus': package_status(),
+        'artifacts': artifacts,
+        'stdout': truncate(stdout),
+        'stderr': truncate(stderr),
+    }
+
+    print(json.dumps(result, ensure_ascii=False, indent=2))
+
+    # Only discard the temp run directory when artifacts live outside it;
+    # with the default artifact dir (inside the run dir) removal would
+    # delete the artifacts the JSON result just reported.
+    if not args.keep_run_dir and result['ok'] and run_dir_path not in artifact_dir.parents:
+        shutil.rmtree(run_dir_path, ignore_errors=True)
+
+    return 0 if result['ok'] else 1
+
+
+if __name__ == '__main__':
+    raise SystemExit(main())
diff --git a/skills/kokoro-tts b/skills/kokoro-tts
new file mode 120000
index 0000000..0d32add
--- /dev/null
+++ b/skills/kokoro-tts
@@ -0,0 +1 @@
+/home/selig/.openclaw/workspace/skills/kokoro-tts
\ No newline at end of file
diff --git a/skills/remotion-best-practices b/skills/remotion-best-practices
new file mode 120000
index 0000000..d8cb165
--- /dev/null
+++ b/skills/remotion-best-practices
@@ -0,0 +1 @@
+/home/selig/.agents/skills/remotion-best-practices
\ No newline at end of file
diff --git a/skills/research-to-paper-slides/SKILL.md b/skills/research-to-paper-slides/SKILL.md
new file mode 100644
index 0000000..4126184
--- /dev/null
+++ b/skills/research-to-paper-slides/SKILL.md
@@ -0,0 +1,125 @@
+---
+name: research-to-paper-slides
+description: Turn local analysis outputs into publication-style drafts and presentation materials.
Use when the user already has research/data-analysis artifacts such as summary.json, report.md, preview.csv, plots, or code-interpreter output and wants a complete first-pass paper draft, slide outline, speaker notes, or HTML deck. Especially useful after using the code-interpreter skill on small-to-medium datasets, when the next step is to package findings into a paper, report, pitch deck, class slides, or meeting presentation.
+---
+
+# research-to-paper-slides
+
+Generate a complete first-pass writing bundle from analysis artifacts.
+
+## Inputs
+
+Best input bundle:
+- `summary.json`
+- `report.md`
+- one or more plot PNG files
+
+Optional:
+- `preview.csv`
+- raw CSV/JSON/XLSX path for source naming only
+- extra notes from the user (audience, tone, purpose)
+
+## Levels
+
+Choose how far the workflow should go:
+
+- `--level v2` — **basic deliverable**
+  - Outputs: `paper.md`, `slides.md`, `speaker-notes.md`, `deck.html`
+  - Good for: quick drafts; getting a first version of the content out
+  - Not included: `insights.md`, per-plot interpretation pages, formal deck visual polish
+
+- `--level v3` — **insight-enhanced**
+  - Includes everything in `v2`
+  - Adds: `insights.md`, one interpretation page per plot, per-plot speaker-note scripts
+  - Good for: internal discussion, research write-ups, walking through charts clearly
+
+- `--level v4` — **formal deliverable**
+  - Includes everything in `v3`
+  - Adds: a more formal deck visual layout and a PDF-ready workflow
+  - Good for: formal presentations, proposals, external showcases
+
+## Modes
+
+- `academic` — papers / research reports / conference presentations
+- `business` — internal decisions / management reporting / strategy briefings
+- `pitch` — proposals / fundraising / externally persuasive presentations
+
+## Outputs
+
+Depending on `--level`, the generator creates:
+- `paper.md` — structured paper/report draft
+- `slides.md` — slide-by-slide content outline
+- `speaker-notes.md` — presenter script notes
+- `insights.md` — key insights + plot interpretations (`v3` / `v4`)
+- `deck.html` — printable deck HTML
+- `bundle.json` — machine-readable manifest with `level` and `levelNote`
+
+Optional local export:
+- `export_pdf.py` — export `deck.html` to PDF via local headless Chromium
+
+## Workflow
+
+1. Point the generator at an analysis artifact directory.
+2. Pass `--mode` for audience style.
+3. Pass `--level` for workflow depth.
+4. Review the generated markdown/html.
+5.
If needed, refine wording or structure.
+6. If using `v4`, export `deck.html` to PDF.
+
+## Commands
+
+### V2 — basic deliverable
+
+```bash
+python3 {baseDir}/scripts/generate_bundle.py \
+  --analysis-dir /path/to/analysis/out \
+  --output-dir /path/to/paper-slides-out \
+  --title "Research Title" \
+  --audience "Investors" \
+  --purpose "Presentation" \
+  --mode business \
+  --level v2
+```
+
+### V3 — insight-enhanced
+
+```bash
+python3 {baseDir}/scripts/generate_bundle.py \
+  --analysis-dir /path/to/analysis/out \
+  --output-dir /path/to/paper-slides-out \
+  --title "Research Title" \
+  --audience "Researchers" \
+  --purpose "Research write-up" \
+  --mode academic \
+  --level v3
+```
+
+### V4 — formal deliverable
+
+```bash
+python3 {baseDir}/scripts/generate_bundle.py \
+  --analysis-dir /path/to/analysis/out \
+  --output-dir /path/to/paper-slides-out \
+  --title "Research Title" \
+  --audience "Investors" \
+  --purpose "Fundraising pitch" \
+  --mode pitch \
+  --level v4
+```
+
+## PDF export
+
+If local Chromium is available, try:
+
+```bash
+python3 {baseDir}/scripts/export_pdf.py \
+  --html /path/to/deck.html \
+  --pdf /path/to/deck.pdf
+```
+
+## Notes
+
+- Prefer this skill after `code-interpreter` or any workflow that already produced plots and structured summaries.
+- Keep this as a first-pass drafting tool; the output is meant to be edited, not treated as final publication-ready text.
+- On this workstation, Chromium CLI `--print-to-pdf` may still fail with host-specific permission/runtime quirks even when directories are writable.
+- When the user wants a PDF, try `export_pdf.py` first; if it fails, immediately fall back to OpenClaw browser PDF export on a locally served `deck.html`.
diff --git a/skills/research-to-paper-slides/references/pdf-notes.md b/skills/research-to-paper-slides/references/pdf-notes.md
new file mode 100644
index 0000000..bb90589
--- /dev/null
+++ b/skills/research-to-paper-slides/references/pdf-notes.md
@@ -0,0 +1,14 @@
+# PDF Notes
+
+## Current recommended path
+
+1. Generate `deck.html` with this skill.
+2. Open `deck.html` in the browser.
+3.
Export to PDF with browser print/PDF flow. +4. If small textual tweaks are needed after PDF export, use the installed `nano-pdf` skill. + +## Why this path + +- HTML is easier to iterate than direct PDF generation. +- Existing plot PNG files can be embedded cleanly. +- Browser PDF export preserves layout reliably for first-pass decks. diff --git a/skills/research-to-paper-slides/scripts/__pycache__/export_pdf.cpython-312.pyc b/skills/research-to-paper-slides/scripts/__pycache__/export_pdf.cpython-312.pyc new file mode 100644 index 0000000..3e1975a Binary files /dev/null and b/skills/research-to-paper-slides/scripts/__pycache__/export_pdf.cpython-312.pyc differ diff --git a/skills/research-to-paper-slides/scripts/__pycache__/generate_bundle.cpython-312.pyc b/skills/research-to-paper-slides/scripts/__pycache__/generate_bundle.cpython-312.pyc new file mode 100644 index 0000000..11716fb Binary files /dev/null and b/skills/research-to-paper-slides/scripts/__pycache__/generate_bundle.cpython-312.pyc differ diff --git a/skills/research-to-paper-slides/scripts/export_pdf.py b/skills/research-to-paper-slides/scripts/export_pdf.py new file mode 100644 index 0000000..0d95095 --- /dev/null +++ b/skills/research-to-paper-slides/scripts/export_pdf.py @@ -0,0 +1,53 @@ +#!/usr/bin/env python3 +import argparse +import glob +import os +import shutil +import subprocess +import tempfile +from pathlib import Path + + +def find_browser() -> str: + # Playwright Chromium (most reliable on this workstation) + for pw in sorted(glob.glob(os.path.expanduser('~/.cache/ms-playwright/chromium-*/chrome-linux/chrome')), reverse=True): + if os.access(pw, os.X_OK): + return pw + for name in ['chromium-browser', 'chromium', 'google-chrome', 'google-chrome-stable']: + path = shutil.which(name) + if path: + return path + raise SystemExit('No supported browser found for PDF export. 
Install Playwright Chromium: npx playwright install chromium') + + +def main() -> int: + parser = argparse.ArgumentParser(description='Export deck HTML to PDF using headless Chromium') + parser.add_argument('--html', required=True) + parser.add_argument('--pdf', required=True) + args = parser.parse_args() + + html_path = Path(args.html).expanduser().resolve() + pdf_path = Path(args.pdf).expanduser().resolve() + pdf_path.parent.mkdir(parents=True, exist_ok=True) + + if not html_path.exists(): + raise SystemExit(f'Missing HTML input: {html_path}') + + browser = find_browser() + with tempfile.TemporaryDirectory(prefix='rtps-chromium-') as profile_dir: + cmd = [ + browser, + '--headless', + '--disable-gpu', + '--no-sandbox', + f'--user-data-dir={profile_dir}', + f'--print-to-pdf={pdf_path}', + html_path.as_uri(), + ] + subprocess.run(cmd, check=True) + print(str(pdf_path)) + return 0 + + +if __name__ == '__main__': + raise SystemExit(main()) diff --git a/skills/research-to-paper-slides/scripts/generate_bundle.py b/skills/research-to-paper-slides/scripts/generate_bundle.py new file mode 100644 index 0000000..1af135e --- /dev/null +++ b/skills/research-to-paper-slides/scripts/generate_bundle.py @@ -0,0 +1,498 @@ +#!/usr/bin/env python3 +import argparse +import json +import html +import shutil +from pathlib import Path +from typing import Any + + +MODES = {'academic', 'business', 'pitch'} +LEVELS = {'v2', 'v3', 'v4'} +LEVEL_NOTES = { + 'v2': '基礎交付版:paper/slides/speaker-notes/deck', + 'v3': '洞察強化版:v2 + insights + 每張圖逐頁解讀', + 'v4': '正式交付版:v3 + 更正式 deck 視覺 + PDF-ready 工作流', +} + + +def read_json(path: Path) -> dict[str, Any]: + return json.loads(path.read_text(encoding='utf-8')) + + +def read_text(path: Path) -> str: + return path.read_text(encoding='utf-8') + + +def find_plots(analysis_dir: Path) -> list[Path]: + return sorted([p for p in analysis_dir.glob('*.png') if p.is_file()]) + + +def build_key_findings(summary: dict[str, Any]) -> list[str]: + findings: list[str] = [] 
+ for name, meta in summary.get('columnProfiles', {}).items(): + if 'mean' in meta and meta.get('mean') is not None: + findings.append(f"欄位「{name}」平均值約為 {meta['mean']:.2f},總和約為 {meta['sum']:.2f}。") + elif meta.get('topValues'): + top = meta['topValues'][0] + findings.append(f"欄位「{name}」最常見值為「{top['value']}」,出現 {top['count']} 次。") + if len(findings) >= 6: + break + if not findings: + findings.append('資料已完成初步整理,但尚缺少足夠特徵以自動歸納具體發現。') + return findings + + +def build_method_text(summary: dict[str, Any]) -> str: + rows = summary.get('rows', 0) + cols = summary.get('columns', 0) + parsed_dates = summary.get('parsedDateColumns', []) + parts = [f"本研究以一份包含 {rows} 筆資料、{cols} 個欄位的資料集作為分析基礎。"] + if parsed_dates: + parts.append(f"其中已自動辨識日期欄位:{', '.join(parsed_dates)}。") + parts.append("分析流程包含欄位剖析、數值摘要、類別分布觀察,以及圖表化初步探索。") + return ''.join(parts) + + +def build_limitations(summary: dict[str, Any], mode: str) -> list[str]: + base = [ + '本版本內容依據自動分析結果生成,仍需依情境補充背景、語境與論證細節。', + '目前主要反映描述性分析與初步視覺化結果,尚未自動進行嚴格因果推論或完整驗證。', + ] + if mode == 'pitch': + base[0] = '本版本適合作為提案底稿,但對外簡報前仍需補上商業敘事、案例與風險說明。' + elif mode == 'business': + base[0] = '本版本可支援內部決策討論,但正式匯報前仍建議補充商務脈絡與對照基準。' + elif mode == 'academic': + base[0] = '本版本可作為論文或研究報告草稿,但正式提交前仍需補足文獻回顧、研究問題與方法論細節。' + if not summary.get('plots'): + base.append('本次分析未包含圖表產物,因此視覺化證據仍需後續補充。') + return base + + +def classify_plot(name: str) -> str: + low = name.lower() + if low.startswith('hist_'): + return 'histogram' + if low.startswith('bar_'): + return 'bar' + if low.startswith('line_'): + return 'line' + return 'plot' + + +def interpret_plot(plot: Path, mode: str) -> dict[str, str]: + kind = classify_plot(plot.name) + base = { + 'histogram': { + 'title': f'圖表解讀:{plot.name}', + 'summary': '這張 histogram 用來觀察數值欄位的分布狀態、集中區域與可能的離群位置。', + 'so_what': '若資料分布偏斜或過度集中,後續可考慮分群、分層或補充異常值檢查。', + }, + 'bar': { + 'title': f'圖表解讀:{plot.name}', + 'summary': '這張 bar chart 適合比較不同類別或分組之間的量體差異,幫助快速辨識高低落差。', + 'so_what': '若類別差異明顯,後續可針對高表現或低表現組別追查原因與策略。', + }, + 'line': { + 
'title': f'圖表解讀:{plot.name}', + 'summary': '這張 line chart 用於觀察時間序列變化,幫助辨識趨勢、波動與可能轉折點。', + 'so_what': '若趨勢持續上升或下降,建議進一步比對外部事件、季節性與干預因素。', + }, + 'plot': { + 'title': f'圖表解讀:{plot.name}', + 'summary': '這張圖表提供一個視覺化切面,有助於快速掌握資料重點與分布特徵。', + 'so_what': '建議將圖表與主要論點對齊,補上更具體的背景解讀。', + }, + }[kind] + + if mode == 'pitch': + base['so_what'] = '簡報時應直接說明這張圖支持了哪個主張,以及它如何增加說服力。' + elif mode == 'business': + base['so_what'] = '建議把這張圖對應到 KPI、風險或下一步行動,方便管理層做判斷。' + elif mode == 'academic': + base['so_what'] = '建議將這張圖與研究問題、假設或比較基準一起討論,以提升論證完整度。' + return base + + +def build_insights(summary: dict[str, Any], plots: list[Path], mode: str) -> list[str]: + insights: list[str] = [] + numeric = [] + categorical = [] + for name, meta in summary.get('columnProfiles', {}).items(): + if 'mean' in meta and meta.get('mean') is not None: + numeric.append((name, meta)) + elif meta.get('topValues'): + categorical.append((name, meta)) + + for name, meta in numeric[:3]: + insights.append(f"數值欄位「{name}」平均約 {meta['mean']:.2f},範圍約 {meta['min']:.2f} 到 {meta['max']:.2f}。") + for name, meta in categorical[:2]: + top = meta['topValues'][0] + insights.append(f"類別欄位「{name}」目前以「{top['value']}」最常見({top['count']} 次),值得作為第一輪聚焦對象。") + if plots: + insights.append(f"本次已生成 {len(plots)} 張圖表,可直接支撐逐頁圖表解讀與口頭報告。") + + if mode == 'pitch': + insights.append('對外提案時,建議把最強的一項數據證據前置,讓聽眾先記住價值主張。') + elif mode == 'business': + insights.append('內部決策簡報時,建議把洞察轉成 KPI、優先順序與負責人。') + elif mode == 'academic': + insights.append('學術/研究情境下,建議將洞察進一步轉成研究問題、比較架構與後續驗證方向。') + return insights + + +def make_insights_md(title: str, mode: str, summary: dict[str, Any], plots: list[Path]) -> str: + insights = build_insights(summary, plots, mode) + plot_notes = [interpret_plot(p, mode) for p in plots] + lines = [f"# {title}|Insights", '', f"- 模式:`{mode}`", ''] + lines.append('## 關鍵洞察') + lines.extend([f"- {x}" for x in insights]) + lines.append('') + if plot_notes: + lines.append('## 圖表解讀摘要') + for note in plot_notes: + lines.append(f"### 
{note['title']}") + lines.append(f"- 解讀:{note['summary']}") + lines.append(f"- 延伸:{note['so_what']}") + lines.append('') + return '\n'.join(lines).strip() + '\n' + + +def make_paper(title: str, audience: str, purpose: str, mode: str, level: str, summary: dict[str, Any], report_md: str, plots: list[Path], insights_md: str | None = None) -> str: + findings = build_key_findings(summary) + method_text = build_method_text(summary) + limitations = build_limitations(summary, mode) + plot_refs = '\n'.join([f"- `{p.name}`" for p in plots]) or '- 無' + findings_md = '\n'.join([f"- {x}" for x in findings]) + limitations_md = '\n'.join([f"- {x}" for x in limitations]) + + if mode == 'academic': + sections = f"## 摘要\n\n本文面向{audience},以「{purpose}」為導向,整理目前資料分析結果並形成學術/研究草稿。\n\n## 研究背景與問題意識\n\n本文件根據既有分析產物自動整理,可作為研究報告、論文初稿或研究提案的起點。\n\n## 研究方法\n\n{method_text}\n\n## 研究發現\n\n{findings_md}\n\n## 討論\n\n目前結果可支撐初步描述性討論,後續可進一步補上研究假設、比較對照與方法嚴謹性。\n\n## 限制\n\n{limitations_md}\n\n## 結論\n\n本分析已形成研究性文件的結構基礎,適合進一步擴展為正式研究報告。" + elif mode == 'business': + sections = f"## 執行摘要\n\n本文面向{audience},目的是支援「{purpose}」的商務溝通與內部決策。\n\n## 商務背景\n\n本文件根據既有分析產物自動整理,適合作為內部簡報、策略討論或管理層報告的第一版。\n\n## 分析方法\n\n{method_text}\n\n## 關鍵洞察\n\n{findings_md}\n\n## 商業意涵\n\n目前資料已足以支撐一輪決策討論,建議進一步對照 KPI、目標值與外部環境。\n\n## 風險與限制\n\n{limitations_md}\n\n## 建議下一步\n\n建議針對最具決策價值的指標建立定期追蹤與後續驗證流程。" + else: + sections = f"## Pitch Summary\n\n本文面向{audience},用於支援「{purpose}」的提案、募資或說服型簡報。\n\n## Opportunity\n\n本文件根據既有分析產物自動整理,可作為提案 deck 與口頭簡報的第一版底稿。\n\n## Evidence\n\n{method_text}\n\n## Key Takeaways\n\n{findings_md}\n\n## Why It Matters\n\n目前結果已可形成明確敘事雛形,後續可補上市場機會、競品比較與具體行動方案。\n\n## Risks\n\n{limitations_md}\n\n## Ask / Next Step\n\n建議將數據證據、主張與下一步行動整合成對外一致的提案版本。" + + insight_section = '' + if insights_md: + insight_section = f"\n## 洞察摘要\n\n{insights_md}\n" + + return f"# {title}\n\n- 模式:`{mode}`\n- 等級:`{level}` — {LEVEL_NOTES[level]}\n- 對象:{audience}\n- 目的:{purpose}\n\n{sections}\n\n## 圖表與視覺化資產\n\n{plot_refs}{insight_section}\n## 
附錄:原始自動分析摘要\n\n{report_md}\n" + + +def make_slides(title: str, audience: str, purpose: str, mode: str, summary: dict[str, Any], plots: list[Path], level: str) -> str: + findings = build_key_findings(summary) + rows = summary.get('rows', 0) + cols = summary.get('columns', 0) + + if mode == 'academic': + slides = [ + ('封面', [f'標題:{title}', f'對象:{audience}', f'目的:{purpose}', f'等級:{LEVEL_NOTES[level]}']), + ('研究問題', ['定義研究背景與核心問題', '說明本次分析欲回答的主題']), + ('資料概況', [f'資料筆數:{rows}', f'欄位數:{cols}', '已完成基本欄位剖析與摘要']), + ('方法', ['描述性統計', '類別分布觀察', '視覺化探索']), + ('研究發現', findings[:3]), + ('討論', ['解釋主要發現的可能意義', '連結研究問題與資料結果']), + ('限制', build_limitations(summary, mode)[:2]), + ('後續研究', ['補充文獻回顧', '加入比較基準與進階分析']), + ('結論', ['本份簡報可作為研究報告或論文簡報的第一版底稿']), + ] + elif mode == 'business': + slides = [ + ('封面', [f'標題:{title}', f'對象:{audience}', f'目的:{purpose}', f'等級:{LEVEL_NOTES[level]}']), + ('決策問題', ['這份分析要支援什麼決策', '為什麼現在需要處理']), + ('資料概況', [f'資料筆數:{rows}', f'欄位數:{cols}', '已完成基本資料盤點']), + ('分析方法', ['描述性統計', '類別分布觀察', '視覺化探索']), + ('關鍵洞察', findings[:3]), + ('商業意涵', ['把數據結果轉成管理層可理解的含義', '指出可能影響的目標或 KPI']), + ('風險與限制', build_limitations(summary, mode)[:2]), + ('建議行動', ['列出近期可執行事項', '定義需要追蹤的指標']), + ('結語', ['本份簡報可作為正式管理簡報的第一版底稿']), + ] + else: + slides = [ + ('封面', [f'標題:{title}', f'對象:{audience}', f'目的:{purpose}', f'等級:{LEVEL_NOTES[level]}']), + ('痛點 / 機會', ['說明這份分析解決什麼問題', '點出為什麼值得關注']), + ('證據基礎', [f'資料筆數:{rows}', f'欄位數:{cols}', '已完成資料摘要與圖表探索']), + ('方法', ['描述性統計', '類別觀察', '關鍵圖表整理']), + ('核心亮點', findings[:3]), + ('為什麼重要', ['連結價值、影響與說服力', '把發現轉成可傳達的敘事']), + ('風險', build_limitations(summary, mode)[:2]), + ('Next Step / Ask', ['明確提出下一步', '對齊資源、合作或決策需求']), + ('結語', ['本份 deck 可作為提案或募資簡報的第一版底稿']), + ] + + parts = [f"# {title}|簡報稿\n\n- 模式:`{mode}`\n- 等級:`{level}` — {LEVEL_NOTES[level]}\n"] + slide_no = 1 + for heading, bullets in slides: + parts.append(f"## Slide {slide_no} — {heading}") + parts.extend([f"- {x}" for x in bullets]) + parts.append('') + slide_no += 1 + + if level in {'v3', 'v4'} 
and plots: + for plot in plots: + note = interpret_plot(plot, mode) + parts.append(f"## Slide {slide_no} — {note['title']}") + parts.append(f"- 圖檔:{plot.name}") + parts.append(f"- 解讀:{note['summary']}") + parts.append(f"- 延伸:{note['so_what']}") + parts.append('') + slide_no += 1 + return '\n'.join(parts).strip() + '\n' + + +def make_speaker_notes(title: str, mode: str, summary: dict[str, Any], plots: list[Path], level: str) -> str: + findings = build_key_findings(summary) + findings_md = '\n'.join([f"- {x}" for x in findings]) + opener = { + 'academic': '先交代研究背景、研究問題與資料來源,再說明這份內容是研究草稿第一版。', + 'business': '先講這份分析支援哪個決策,再交代這份內容的管理價值與時間敏感性。', + 'pitch': '先抓住聽眾注意力,說明痛點、機會與這份資料為何值得相信。', + }[mode] + closer = { + 'academic': '結尾時回到研究限制與後續研究方向。', + 'business': '結尾時回到建議行動與追蹤機制。', + 'pitch': '結尾時回到 ask、資源需求與下一步承諾。', + }[mode] + parts = [ + f"# {title}|Speaker Notes", + '', + f"- 模式:`{mode}`", + f"- 等級:`{level}` — {LEVEL_NOTES[level]}", + '', + '## 開場', + f"- {opener}", + '', + '## 重點提示', + findings_md, + '', + ] + if level in {'v3', 'v4'} and plots: + parts.extend(['## 逐圖口頭提示', '']) + for plot in plots: + note = interpret_plot(plot, mode) + parts.append(f"### {plot.name}") + parts.append(f"- {note['summary']}") + parts.append(f"- {note['so_what']}") + parts.append('') + parts.extend(['## 收尾建議', f"- {closer}", '- 針對最重要的一張圖,多講一層其背後的意義與行動建議。', '']) + return '\n'.join(parts) + + +def make_deck_html(title: str, audience: str, purpose: str, slides_md: str, plots: list[Path], mode: str, level: str) -> str: + if level == 'v4': + theme = { + 'academic': {'primary': '#0f172a', 'accent': '#334155', 'bg': '#eef2ff', 'hero': 'linear-gradient(135deg,#0f172a 0%,#1e293b 55%,#475569 100%)'}, + 'business': {'primary': '#0b3b66', 'accent': '#1d4ed8', 'bg': '#eff6ff', 'hero': 'linear-gradient(135deg,#0b3b66 0%,#1d4ed8 60%,#60a5fa 100%)'}, + 'pitch': {'primary': '#4c1d95', 'accent': '#7c3aed', 'bg': '#faf5ff', 'hero': 'linear-gradient(135deg,#4c1d95 0%,#7c3aed 60%,#c084fc 100%)'}, + }[mode] + 
primary = theme['primary'] + accent = theme['accent'] + bg = theme['bg'] + hero = theme['hero'] + plot_map = {p.name: p for p in plots} + else: + primary = '#1f2937' + accent = '#2563eb' + bg = '#f6f8fb' + hero = None + plot_map = {p.name: p for p in plots} + + slide_blocks = [] + current = [] + current_title = None + for line in slides_md.splitlines(): + if line.startswith('## Slide '): + if current_title is not None: + slide_blocks.append((current_title, current)) + current_title = line.replace('## ', '', 1) + current = [] + elif line.startswith('- 模式:') or line.startswith('- 等級:') or line.startswith('# '): + continue + else: + current.append(line) + if current_title is not None: + slide_blocks.append((current_title, current)) + + sections = [] + for heading, body in slide_blocks: + body_html = [] + referenced_plot = None + for line in body: + line = line.strip() + if not line: + continue + if line.startswith('- 圖檔:'): + plot_name = line.replace('- 圖檔:', '', 1).strip() + referenced_plot = plot_map.get(plot_name) + body_html.append(f"
<li>{html.escape(line[2:])}</li>") + elif line.startswith('- '): + body_html.append(f"<li>{html.escape(line[2:])}</li>") + else: + body_html.append(f"<p>{html.escape(line)}</p>") + img_html = '' + if referenced_plot and level in {'v3', 'v4'}: + img_html = f"<div class='plot-single'><img src='{html.escape(referenced_plot.name)}' alt='{html.escape(referenced_plot.name)}'><p class='plot-caption'>圖:{html.escape(referenced_plot.name)}</p></div>" + list_items = ''.join(x for x in body_html if x.startswith('<li>')) + paras = ''.join(x for x in body_html if x.startswith('<p>')) + list_html = f"<ul>{list_items}</ul>" if list_items else '' + if level == 'v4': + sections.append( + f"<section class='slide'><div class='slide-top'><span class='eyebrow'>{html.escape(mode.upper())}</span>" + f"<span class='page-tag'>{html.escape(heading.split(' — ')[0])}</span></div><h2>{html.escape(heading)}</h2>{paras}{list_html}{img_html}</section>" + ) + else: + sections.append(f"<section class='slide'><h2>{html.escape(heading)}</h2>{paras}{list_html}{img_html}</section>
    ") + + if level == 'v4': + css = f""" +@page {{ size: A4 landscape; margin: 0; }} +@media print {{ + body {{ background: #fff; padding: 0; }} + .slide {{ box-shadow: none; margin: 0; min-height: 100vh; border-radius: 0; page-break-after: always; page-break-inside: avoid; border-top-width: 16px; border-top-style: solid; border-top-color: {accent}; }} + .hero {{ box-shadow: none; margin: 0; min-height: 100vh; border-radius: 0; }} +}} +body {{ font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, 'Noto Sans CJK TC', sans-serif; background: {bg}; margin: 0; padding: 32px; color: {primary}; }} +.hero {{ max-width: 1180px; margin: 0 auto 32px; padding: 56px 64px; border-radius: 32px; background: {hero}; color: white; box-shadow: 0 32px 64px rgba(15,23,42,.15); display: flex; flex-direction: column; justify-content: center; min-height: 500px; }} +.hero h1 {{ margin: 12px 0 20px; font-size: 52px; line-height: 1.2; letter-spacing: -0.02em; font-weight: 800; text-wrap: balance; }} +.hero p {{ margin: 8px 0; font-size: 20px; opacity: .9; font-weight: 400; }} +.hero-meta {{ display: grid; grid-template-columns: repeat(auto-fit, minmax(200px, 1fr)); gap: 16px; margin-top: 48px; }} +.hero-card {{ background: rgba(255,255,255,.1); border: 1px solid rgba(255,255,255,.2); border-radius: 20px; padding: 20px 24px; backdrop-filter: blur(10px); }} +.hero-card strong {{ display: block; font-size: 14px; text-transform: uppercase; letter-spacing: 0.05em; opacity: 0.8; margin-bottom: 6px; }} +.slide {{ background: #fff; border-radius: 32px; padding: 48px 56px; margin: 0 auto 32px; max-width: 1180px; min-height: 660px; box-shadow: 0 16px 48px rgba(15,23,42,.08); page-break-after: always; border-top: 16px solid {accent}; position: relative; overflow: hidden; display: flex; flex-direction: column; }} +.slide::after {{ content: ''; position: absolute; right: -80px; top: -80px; width: 240px; height: 240px; background: radial-gradient(circle, {bg} 0%, 
rgba(255,255,255,0) 70%); pointer-events: none; }} +.slide-top {{ display: flex; justify-content: space-between; align-items: center; margin-bottom: 24px; z-index: 1; }} +h1, h2 {{ margin-top: 0; font-weight: 700; }} +h2 {{ font-size: 36px; margin-bottom: 24px; color: {primary}; letter-spacing: -0.01em; }} +.slide p {{ font-size: 20px; line-height: 1.6; color: #334155; margin-bottom: 16px; }} +.slide ul {{ line-height: 1.6; font-size: 22px; padding-left: 28px; color: #1e293b; margin-top: 8px; flex-grow: 1; }} +.slide li {{ position: relative; padding-left: 8px; }} +.slide li + li {{ margin-top: 14px; }} +.slide li::marker {{ color: {accent}; font-weight: bold; }} +.eyebrow {{ display: inline-flex; align-items: center; padding: 8px 16px; border-radius: 999px; background: {bg}; color: {accent}; font-weight: 800; font-size: 13px; letter-spacing: .1em; box-shadow: 0 2px 8px rgba(0,0,0,0.04); }} +.page-tag {{ color: #94a3b8; font-size: 14px; font-weight: 700; text-transform: uppercase; letter-spacing: 0.05em; }} +.plot-single {{ margin-top: auto; text-align: center; padding-top: 24px; position: relative; display: flex; flex-direction: column; align-items: center; justify-content: center; }} +.plot-single img {{ max-width: 100%; max-height: 380px; border: 1px solid #e2e8f0; border-radius: 20px; background: #f8fafc; box-shadow: 0 12px 32px rgba(15,23,42,.06); padding: 8px; }} +.plot-caption {{ margin-top: 14px; font-size: 15px !important; color: #64748b !important; font-style: italic; text-align: center; background: #f1f5f9; padding: 6px 16px; border-radius: 999px; }} +""".strip() + hero_html = ( + f"
<section class='hero'><span class='eyebrow'>{html.escape(mode.upper())}</span>" + f"<h1>{html.escape(title)}</h1><p>適用對象:{html.escape(audience)}</p><p>目的:{html.escape(purpose)}</p>" + f"<div class='hero-meta'>" + f"<div class='hero-card'><strong>等級</strong>{html.escape(level)} — {html.escape(LEVEL_NOTES[level])}</div>" + f"<div class='hero-card'><strong>圖表數量</strong>{len(plots)}</div>" + f"<div class='hero-card'><strong>輸出定位</strong>正式 deck / PDF-ready</div>" + f"</div></section>
    " + ) + else: + css = f""" +body {{ font-family: Arial, 'Noto Sans CJK TC', sans-serif; background: {bg}; margin: 0; padding: 24px; color: {primary}; }} +.hero {{ max-width: 1100px; margin: 0 auto 24px; padding: 8px 6px; }} +.slide {{ background: #fff; border-radius: 18px; padding: 32px; margin: 0 auto 24px; max-width: 1100px; box-shadow: 0 8px 28px rgba(0,0,0,.08); page-break-after: always; border-top: 10px solid {accent}; }} +h1, h2 {{ margin-top: 0; }} +h1 {{ font-size: 40px; }} +ul {{ line-height: 1.7; }} +.plot-single {{ margin-top: 18px; text-align: center; }} +img {{ max-width: 100%; border: 1px solid #ddd; border-radius: 12px; background: #fff; }} +.plot-caption {{ margin-top: 10px; font-size: 14px; color: #6b7280; font-style: italic; }} +""".strip() + hero_html = ( + f"

    {html.escape(title)}

    對象:{html.escape(audience)}

    " + f"

    目的:{html.escape(purpose)}

    等級:{html.escape(level)} — {html.escape(LEVEL_NOTES[level])}

    " + ) + + return ( + "" + f"{html.escape(title)}" + + hero_html + + ''.join(sections) + + "" + ) + + +def main() -> int: + parser = argparse.ArgumentParser( + description='Generate paper/slides bundle from analysis outputs', + epilog=( + 'Levels: ' + 'v2=基礎交付版(paper/slides/speaker-notes/deck); ' + 'v3=洞察強化版(v2 + insights + 每張圖逐頁解讀); ' + 'v4=正式交付版(v3 + 更正式 deck 視覺 + PDF-ready 工作流)' + ), + ) + parser.add_argument('--analysis-dir', required=True) + parser.add_argument('--output-dir', required=True) + parser.add_argument('--title', default='研究分析草稿') + parser.add_argument('--audience', default='決策者') + parser.add_argument('--purpose', default='研究報告') + parser.add_argument('--mode', default='business', choices=sorted(MODES)) + parser.add_argument( + '--level', + default='v4', + choices=sorted(LEVELS), + help='輸出等級:v2=基礎交付版;v3=洞察強化版;v4=正式交付版(預設)', + ) + args = parser.parse_args() + + analysis_dir = Path(args.analysis_dir).expanduser().resolve() + output_dir = Path(args.output_dir).expanduser().resolve() + output_dir.mkdir(parents=True, exist_ok=True) + + summary_path = analysis_dir / 'summary.json' + report_path = analysis_dir / 'report.md' + if not summary_path.exists(): + raise SystemExit(f'Missing summary.json in {analysis_dir}') + if not report_path.exists(): + raise SystemExit(f'Missing report.md in {analysis_dir}') + + summary = read_json(summary_path) + report_md = read_text(report_path) + plots = find_plots(analysis_dir) + insights_md = make_insights_md(args.title, args.mode, summary, plots) if args.level in {'v3', 'v4'} else None + + paper_md = make_paper(args.title, args.audience, args.purpose, args.mode, args.level, summary, report_md, plots, insights_md) + slides_md = make_slides(args.title, args.audience, args.purpose, args.mode, summary, plots, args.level) + speaker_notes = make_speaker_notes(args.title, args.mode, summary, plots, args.level) + deck_html = make_deck_html(args.title, args.audience, args.purpose, slides_md, plots, args.mode, args.level) + 
+ for plot in plots: + dest = output_dir / plot.name + if dest != plot: + shutil.copy2(plot, dest) + + (output_dir / 'paper.md').write_text(paper_md, encoding='utf-8') + (output_dir / 'slides.md').write_text(slides_md, encoding='utf-8') + (output_dir / 'speaker-notes.md').write_text(speaker_notes, encoding='utf-8') + (output_dir / 'deck.html').write_text(deck_html, encoding='utf-8') + if insights_md: + (output_dir / 'insights.md').write_text(insights_md, encoding='utf-8') + + manifest_outputs = { + 'paper': str(output_dir / 'paper.md'), + 'slides': str(output_dir / 'slides.md'), + 'speakerNotes': str(output_dir / 'speaker-notes.md'), + 'deckHtml': str(output_dir / 'deck.html'), + } + if insights_md: + manifest_outputs['insights'] = str(output_dir / 'insights.md') + + manifest = { + 'title': args.title, + 'audience': args.audience, + 'purpose': args.purpose, + 'mode': args.mode, + 'level': args.level, + 'levelNote': LEVEL_NOTES[args.level], + 'analysisDir': str(analysis_dir), + 'outputs': manifest_outputs, + 'plots': [str(p) for p in plots], + } + (output_dir / 'bundle.json').write_text(json.dumps(manifest, ensure_ascii=False, indent=2), encoding='utf-8') + print(json.dumps(manifest, ensure_ascii=False, indent=2)) + return 0 + + +if __name__ == '__main__': + raise SystemExit(main()) diff --git a/skills/skill-review/SKILL.md b/skills/skill-review/SKILL.md index f30259c..fd3ae44 100644 --- a/skills/skill-review/SKILL.md +++ b/skills/skill-review/SKILL.md @@ -23,6 +23,7 @@ tools: - `GITEA_URL`: Gitea 基礎 URL(https://git.nature.edu.kg) - `GITEA_TOKEN_`: 你的 Gitea API token(根據 agent ID 取對應的) - Agent → Gitea 帳號對應: + - main → `xiaoming`(小明,專案管理/綜合審查) - tiangong → `tiangong`(天工,架構/安全) - kaiwu → `kaiwu`(開物,UX/前端) - yucheng → `yucheng`(玉成,全棧/測試) @@ -31,6 +32,12 @@ tools: 根據你的角色,重點審查不同面向: +### 小明(main)— 專案經理 +- 整體 skill 的完整性與一致性 +- SKILL.md 描述是否清楚、trigger 是否遺漏常見用法 +- 跨 skill 的重複邏輯或可整合之處 +- 文件與實作是否同步 + ### 天工(tiangong)— 架構設計師 - SKILL.md 的 trigger 設計是否合理、會不會誤觸發 - handler.ts 
的錯誤處理、邊界情況 diff --git a/skills/skill-review/handler.ts b/skills/skill-review/handler.ts index 0fcb82d..0a757c0 100644 --- a/skills/skill-review/handler.ts +++ b/skills/skill-review/handler.ts @@ -9,6 +9,7 @@ const REPO_NAME = 'openclaw-skill'; // Agent ID → Gitea 帳號 & token 環境變數對應 const AGENT_MAP: Record<string, { username: string; tokenEnv: string }> = { + main: { username: 'xiaoming', tokenEnv: 'GITEA_TOKEN_XIAOMING' }, tiangong: { username: 'tiangong', tokenEnv: 'GITEA_TOKEN_TIANGONG' }, kaiwu: { username: 'kaiwu', tokenEnv: 'GITEA_TOKEN_KAIWU' }, yucheng: { username: 'yucheng', tokenEnv: 'GITEA_TOKEN_YUCHENG' }, diff --git a/skills/summarize b/skills/summarize new file mode 120000 index 0000000..1d00030 --- /dev/null +++ b/skills/summarize @@ -0,0 +1 @@ +/home/selig/.openclaw/workspace/skills/summarize \ No newline at end of file diff --git a/skills/tavily-tool/.clawhub/origin.json b/skills/tavily-tool/.clawhub/origin.json new file mode 100644 index 0000000..0b1f4f7 --- /dev/null +++ b/skills/tavily-tool/.clawhub/origin.json @@ -0,0 +1,7 @@ +{ + "version": 1, + "registry": "https://clawhub.ai", + "slug": "tavily-tool", + "installedVersion": "0.1.1", + "installedAt": 1773199294594 +} diff --git a/skills/tavily-tool/SKILL.md b/skills/tavily-tool/SKILL.md new file mode 100644 index 0000000..0dedf12 --- /dev/null +++ b/skills/tavily-tool/SKILL.md @@ -0,0 +1,46 @@ +--- +name: tavily +description: Use Tavily web search/discovery to find URLs/sources, do lead research, gather up-to-date links, or produce a cited summary from web results. +metadata: {"openclaw":{"requires":{"env":["TAVILY_API_KEY"]},"primaryEnv":"TAVILY_API_KEY"}} +--- + +# Tavily + +Use the bundled CLI to run Tavily searches from the terminal and collect sources fast. + +## Quick start (CLI) + +The scripts **require** `TAVILY_API_KEY` in the environment (sent as `Authorization: Bearer ...`). + +```bash +export TAVILY_API_KEY="..." 
+node skills/tavily-tool/scripts/tavily_search.js --query "best rust http client" --max_results 5 +``` + +- JSON response is printed to **stdout**. +- A simple URL list is printed to **stderr** by default. + +## Common patterns + +### Get URLs only + +```bash +export TAVILY_API_KEY="..." +node skills/tavily-tool/scripts/tavily_search.js --query "OpenTelemetry collector config" --urls-only +``` + +### Restrict to (or exclude) specific domains + +```bash +export TAVILY_API_KEY="..." +node skills/tavily-tool/scripts/tavily_search.js \ + --query "oauth device code flow" \ + --include_domains oauth.net,datatracker.ietf.org \ + --exclude_domains medium.com +``` + +## Notes + +- The bundled CLI supports a subset of Tavily’s request fields (query, max_results, include_domains, exclude_domains). +- For API field notes and more examples, read: `references/tavily-api.md`. +- Wrapper script (optional): `scripts/tavily_search.sh`. diff --git a/skills/tavily-tool/_meta.json b/skills/tavily-tool/_meta.json new file mode 100644 index 0000000..327e6d0 --- /dev/null +++ b/skills/tavily-tool/_meta.json @@ -0,0 +1,6 @@ +{ + "ownerId": "kn78x7kg14jggfbz385es5bdrn81ddgw", + "slug": "tavily-tool", + "version": "0.1.1", + "publishedAt": 1772290357545 +} \ No newline at end of file diff --git a/skills/tavily-tool/references/tavily-api.md b/skills/tavily-tool/references/tavily-api.md new file mode 100644 index 0000000..ecdf1c7 --- /dev/null +++ b/skills/tavily-tool/references/tavily-api.md @@ -0,0 +1,55 @@ +# Tavily API notes (quick reference) + +## Endpoint + +- Search: `POST https://api.tavily.com/search` + +## Auth + +- Send the API key via HTTP header: `Authorization: Bearer <api-key>`. +- This skill’s scripts read the key from **env var only**: `TAVILY_API_KEY`. 
+ +## Common request fields + +```json +{ + "query": "...", + "max_results": 5, + "include_domains": ["example.com"], + "exclude_domains": ["spam.com"] +} +``` + +(Additional Tavily options exist; this skill’s CLI supports only a common subset for discovery use-cases.) + +## Script usage + +### JSON output (stdout) + URL list (stderr) + +```bash +export TAVILY_API_KEY="..." +node skills/tavily-tool/scripts/tavily_search.js --query "best open source vector database" --max_results 5 +``` + +### URLs only + +```bash +export TAVILY_API_KEY="..." +node skills/tavily-tool/scripts/tavily_search.js --query "SvelteKit tutorial" --urls-only +``` + +### Include / exclude domains + +```bash +export TAVILY_API_KEY="..." +node skills/tavily-tool/scripts/tavily_search.js \ + --query "websocket load testing" \ + --include_domains k6.io,github.com \ + --exclude_domains medium.com +``` + +## Notes + +- Exit code `2` indicates missing required args or missing `TAVILY_API_KEY`. +- Exit code `3` indicates network/HTTP failure. +- Exit code `4` indicates a non-JSON response. diff --git a/skills/tavily-tool/scripts/tavily_search.js b/skills/tavily-tool/scripts/tavily_search.js new file mode 100644 index 0000000..b9f083d --- /dev/null +++ b/skills/tavily-tool/scripts/tavily_search.js @@ -0,0 +1,161 @@ +#!/usr/bin/env node +/** + * Tavily Search CLI + * + * - Reads TAVILY_API_KEY from env only. + * - Prints full JSON response to stdout. + * - Prints a simple list of URLs to stderr by default (can be disabled). + */ + +const TAVILY_ENDPOINT = 'https://api.tavily.com/search'; + +function usage(msg) { + if (msg) console.error(`Error: ${msg}\n`); + console.error(`Usage: + tavily_search.js --query "..." 
[--max_results 5] [--include_domains a.com,b.com] [--exclude_domains x.com,y.com] + +Options: + --query, -q Search query (required) + --max_results, -n Max results (default: 5; clamped to 0..20) + --include_domains Comma-separated domains to include + --exclude_domains Comma-separated domains to exclude + --urls-stderr Print URL list to stderr (default: true) + --no-urls-stderr Disable URL list to stderr + --urls-only Print URLs (one per line) to stdout instead of JSON + --help, -h Show help + +Env: + TAVILY_API_KEY (required) Tavily API key + +Exit codes: + 0 success + 2 usage / missing required inputs + 3 network / HTTP error + 4 invalid JSON response +`); +} + +function parseArgs(argv) { + const out = { + query: null, + max_results: 5, + include_domains: null, + exclude_domains: null, + urls_stderr: true, + urls_only: false, + help: false, + }; + + for (let i = 0; i < argv.length; i++) { + const a = argv[i]; + if (a === '--help' || a === '-h') out.help = true; + else if (a === '--query' || a === '-q') out.query = argv[++i]; + else if (a === '--max_results' || a === '-n') out.max_results = Number(argv[++i]); + else if (a === '--include_domains') out.include_domains = argv[++i]; + else if (a === '--exclude_domains') out.exclude_domains = argv[++i]; + else if (a === '--urls-stderr') out.urls_stderr = true; + else if (a === '--no-urls-stderr') out.urls_stderr = false; + else if (a === '--urls-only') out.urls_only = true; + else return { error: `Unknown arg: ${a}` }; + } + + if (Number.isNaN(out.max_results) || !Number.isFinite(out.max_results)) { + return { error: `--max_results must be a number` }; + } + // Tavily allows 0..20; clamp to stay in range. + out.max_results = Math.max(0, Math.min(20, Math.trunc(out.max_results))); + + const csvToArray = (s) => { + if (!s) return null; + const arr = s.split(',').map(x => x.trim()).filter(Boolean); + return arr.length ? 
arr : null; + }; + + out.include_domains = csvToArray(out.include_domains); + out.exclude_domains = csvToArray(out.exclude_domains); + + return out; +} + +async function main() { + const args = parseArgs(process.argv.slice(2)); + if (args.error) { + usage(args.error); + process.exit(2); + } + if (args.help) { + usage(); + process.exit(0); + } + + const apiKey = process.env.TAVILY_API_KEY; + if (!apiKey) { + usage('TAVILY_API_KEY env var is required'); + process.exit(2); + } + + if (!args.query) { + usage('--query is required'); + process.exit(2); + } + + const payload = { + query: args.query, + max_results: args.max_results, + }; + if (args.include_domains) payload.include_domains = args.include_domains; + if (args.exclude_domains) payload.exclude_domains = args.exclude_domains; + + let res; + try { + res = await fetch(TAVILY_ENDPOINT, { + method: 'POST', + headers: { + 'Content-Type': 'application/json', + 'Authorization': `Bearer ${apiKey}`, + }, + body: JSON.stringify(payload), + }); + } catch (e) { + console.error(`Network error calling Tavily: ${e?.message || String(e)}`); + process.exit(3); + } + + if (!res.ok) { + let bodyText = ''; + try { bodyText = await res.text(); } catch {} + console.error(`Tavily HTTP error: ${res.status} ${res.statusText}`); + if (bodyText) console.error(bodyText); + process.exit(3); + } + + let data; + try { + data = await res.json(); + } catch (e) { + console.error(`Invalid JSON response from Tavily: ${e?.message || String(e)}`); + process.exit(4); + } + + const urls = Array.isArray(data?.results) + ? 
data.results.map(r => r?.url).filter(Boolean) + : []; + + if (args.urls_only) { + for (const u of urls) process.stdout.write(`${u}\n`); + process.exit(0); + } + + process.stdout.write(JSON.stringify(data, null, 2)); + process.stdout.write('\n'); + + if (args.urls_stderr && urls.length) { + console.error('\nURLs:'); + for (const u of urls) console.error(u); + } +} + +main().catch((e) => { + console.error(`Unexpected error: ${e?.stack || e?.message || String(e)}`); + process.exit(1); +}); diff --git a/skills/tavily-tool/scripts/tavily_search.sh b/skills/tavily-tool/scripts/tavily_search.sh new file mode 100644 index 0000000..1eea70b --- /dev/null +++ b/skills/tavily-tool/scripts/tavily_search.sh @@ -0,0 +1,9 @@ +#!/usr/bin/env bash +set -euo pipefail + +# Wrapper to run the Node Tavily search CLI. +# Usage: +# TAVILY_API_KEY=... ./tavily_search.sh --query "..." --max_results 5 + +DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +exec node "$DIR/tavily_search.js" "$@"