FINOS Events

2025-09-19 - AI Evaluation & Benchmarking Workshop

Written by FINOS Team | Aug 8, 2025 2:46:39 PM

Help set the industry standard for trustworthy, production‑ready GenAI in financial services.

REGISTER HERE

Why this workshop, why now?

Modern LLM‑powered solutions can’t move from proof‑of‑concept to production unless they are measured in ways that build trust and confidence. FINOS has already bootstrapped an Open Financial LLM Leaderboard that evaluates models on 35 datasets across 23 finance‑specific tasks, but contributors now need a clear, community‑owned process to expand that work into a full Evaluation & Benchmarking Suite.

This scoping workshop is the official kick‑off for that effort. By the end of a highly interactive half‑day session you will have:

  1. A consensus‑driven Project Scope Statement that defines what the suite is and is not.

  2. A negotiated “Must‑Have” feature backlog and a theme‑based public roadmap to Q4 2025 and beyond.

  3. A clear path for contributing new datasets, tasks and risk metrics under FINOS governance.

Who should attend?

  1. Financial‑services technologists building or evaluating GenAI solutions

  2. Model‑risk, compliance & audit professionals aligning with the EU AI Act and other regulations

  3. Data‑science researchers & academics focusing on domain‑specific benchmarks

  4. Open‑source contributors & vendors offering evaluation tooling or datasets

Key take‑aways

  1. Practical artefacts ready to merge into the FINOS repo: README, CONTRIBUTING, GitHub issues.

  2. Peer network across banks, vendors, academia and regulators shaping open standards for GenAI assurance.

  3. Early influence on the next features of the Open Financial LLM Leaderboard and future evaluation tooling.

Logistics

  1. Date: 19 September 2025

  2. Time: 13:00 – 17:00 BST

  3. Location: London (venue TBA)

  4. Cost: Free – registration required