This document is a readable overview of the repository structure and key modules. The complete, runnable source code including practice data and tests is on GitHub.
Repository structure
intelligent-financial-close/
├── src/
│   ├── ml_engine/
│   │   ├── anomaly_detector.py
│   │   ├── predictive_model.py
│   │   └── data_processor.py
│   ├── automation/
│   │   ├── workflow_manager.py
│   │   ├── task_scheduler.py
│   │   └── progress_tracker.py
│   └── api/
│       ├── app.py
│       ├── routes.py
│       └── database.py
├── data/              # Practice ledger data
├── tests/
├── docs/
└── requirements.txt
Key source files
ML engine anomaly detector
The core anomaly detection module. The default detector applies a Z-score threshold of 2.5σ; an IsolationForest serves as an alternative baseline for comparison. The excerpt below shows the IsolationForest baseline.
# anomaly_detector.py
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

class FinancialAnomalyDetector:
    def __init__(self, contamination=0.1, random_state=42):
        self.contamination = contamination
        self.model = IsolationForest(
            contamination=contamination,
            random_state=random_state,
            n_estimators=100,
        )
        self.scaler = StandardScaler()
        self.is_fitted = False

    def fit(self, X):
        """Scale the features and fit the isolation forest."""
        X_scaled = self.scaler.fit_transform(X)
        self.model.fit(X_scaled)
        self.is_fitted = True

    def predict(self, X):
        """Return (labels, scores): -1 labels anomalies; lower scores are more anomalous."""
        if not self.is_fitted:
            raise ValueError("Model must be fitted before prediction")
        X_scaled = self.scaler.transform(X)
        predictions = self.model.predict(X_scaled)
        anomaly_scores = self.model.score_samples(X_scaled)
        return predictions, anomaly_scores
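The 2.5σ Z-score default described above is not shown in the excerpt; a minimal sketch of that approach might look like the following. The function name and interface are illustrative, not taken from the repository.

```python
# Z-score sketch: flag values more than `threshold` standard deviations
# from the mean (2.5 sigma by default, matching the stated default).
from statistics import mean, pstdev

def zscore_anomalies(values, threshold=2.5):
    """Return indices of values whose |Z-score| exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical: nothing can be anomalous
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# A single extreme entry among routine ledger amounts is flagged
print(zscore_anomalies([10, 11, 9, 10, 12, 10, 11, 100]))  # → [7]
```

Unlike the IsolationForest baseline, this requires no fitting step, which is why a fixed threshold is a convenient default for small ledgers.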
Workflow manager
Models the 17-task close sequence with declared dependencies. Any task whose upstream dependency has not completed is held; the bottleneck becomes visible in real time rather than at cycle end.
# workflow_manager.py
import json
from datetime import datetime
from enum import Enum

class TaskStatus(Enum):
    PENDING = "pending"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"

class WorkflowManager:
    def __init__(self, config_path):
        with open(config_path, "r") as f:
            self.config = json.load(f)
        self.tasks = {}
        self.initialize_tasks()

    def initialize_tasks(self):
        # Task is defined elsewhere in the module (excerpt abridged)
        for task_config in self.config["tasks"]:
            task = Task(task_config)
            self.tasks[task.id] = task

    def execute_workflow(self):
        # is_complete / get_ready_tasks / execute_task omitted from this excerpt
        while not self.is_complete():
            ready_tasks = self.get_ready_tasks()
            for task in ready_tasks:
                self.execute_task(task)
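The dependency gating described above can be sketched in isolation: a task is "ready" only when every upstream dependency has completed. The task names and dependency map below are illustrative, not drawn from the repository's 17-task config.

```python
# Minimal sketch of get_ready_tasks-style gating over a dependency map.
def get_ready_tasks(dependencies, completed):
    """Return not-yet-completed tasks whose dependencies are all completed."""
    return [
        task for task, deps in dependencies.items()
        if task not in completed and all(d in completed for d in deps)
    ]

deps = {
    "load_trial_balance": [],
    "reconcile_bank": ["load_trial_balance"],
    "post_accruals": ["load_trial_balance"],
    "close_ledger": ["reconcile_bank", "post_accruals"],
}
print(get_ready_tasks(deps, completed=set()))
# only the task with no dependencies is ready at the start
```

Re-running this after each completion is what surfaces the bottleneck in real time: a task stuck upstream keeps everything downstream out of the ready set.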
API metrics endpoint
Exposes model outputs for the dashboard. The hardcoded figures below reflect prototype scenario outputs, not claims about client performance.
# app.py
from flask import Flask, jsonify
from flask_cors import CORS
import sqlite3

app = Flask(__name__)
CORS(app)

@app.route("/api/anomalies", methods=["GET"])
def get_anomalies():
    conn = sqlite3.connect("financial_data.db")
    cursor = conn.cursor()
    cursor.execute("""
        SELECT * FROM transactions
        WHERE anomaly_score < -0.5
        ORDER BY transaction_date DESC
    """)
    anomalies = cursor.fetchall()
    conn.close()
    return jsonify({"anomalies": anomalies, "count": len(anomalies)})

@app.route("/api/metrics", methods=["GET"])
def get_metrics():
    # Prototype scenario outputs. See README for methodology.
    return jsonify({
        "automation_rate": 73.7,
        "ml_confidence": 87,
        "close_tasks_sequenced": 17,
        "anomalies_detected": 23,
    })
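The anomalies query can be exercised against an in-memory SQLite database. The schema and rows below are made-up practice data for illustration; the repository's actual `transactions` table may differ.

```python
# Run the /api/anomalies SELECT against a throwaway in-memory database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, transaction_date TEXT, "
    "amount REAL, anomaly_score REAL)"
)
rows = [
    (1, "2024-01-02", 1200.00, -0.12),  # typical entry
    (2, "2024-01-03", 98000.0, -0.71),  # flagged: score below -0.5
    (3, "2024-01-04", 450.00, -0.05),   # typical entry
]
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

anomalies = conn.execute(
    "SELECT * FROM transactions WHERE anomaly_score < -0.5 "
    "ORDER BY transaction_date DESC"
).fetchall()
print(anomalies)  # only the row whose anomaly_score is below -0.5
```

The `-0.5` cutoff mirrors the endpoint's filter on `score_samples` output, where lower scores indicate stronger anomalies.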
Installation
# Create and activate virtual environment
python -m venv venv
source venv/bin/activate # macOS/Linux
# Install dependencies
pip install -r requirements.txt
# Initialise database with practice data
python init_database.py
# Run tests
python -m pytest tests/ -v
Dependencies
scikit-learn==1.3.0
pandas==2.0.3
numpy==1.24.3
Flask==2.3.3
Flask-CORS==4.0.0
python-dotenv==1.0.0
The complete repository, including generation scripts, Jupyter notebooks, and test suite, is on GitHub.