AI Insights

Signal feed

AI stories, scored and filtered.

Live items from our monitored sources, filtered for signal and annotated with a recommended posture for enterprise leaders.

844 stories

  1. 24 Oct EXPLORE

    Deploy Embedding Models with Hugging Face Inference Endpoints

    Hugging Face Blog

    Hugging Face announced new inference endpoints specifically for deploying embedding models, targeting enterprise use cases.

    Why it matters

    Hugging Face's dedicated embedding model inference endpoints simplify deployment and could reduce operational overhead for critical RAG components in AI applications at global systemically important banks (G-SIBs).

    Hype 4/10
  2. 15 Oct EXPLORE

    Reflections on AI Engineer Summit 2023

    Eugene Yan

    Reflections from the AI Engineer Summit highlight deployment challenges, backward compatibility, and multi-modality.

    Why it matters

    Insights into AI deployment challenges from leading practitioners confirm that G-SIBs face similar integration and scalability hurdles with frontier models.

    Hype 4/10
  3. 11 Oct EXPLORE

    OpenAI’s technology explained

    OpenAI News

    OpenAI published a general explanation of its core technologies, including model architectures, training processes, and safety principles.

    Why it matters

    Understanding OpenAI's foundational explanations supports internal model risk governance and validation frameworks for models built on their APIs.

    Hype 4/10
  4. 11 Oct EXPLORE

    Simplifying contract reviews with AI

    OpenAI News

    Ironclad uses OpenAI's GPT-4 to streamline the contract review process, demonstrating application in legal tech.

    Why it matters

    This use case reinforces the immediate applicability of commercial LLMs for G-SIB-relevant document processing, particularly in legal and compliance.

    Hype 4/10
  5. 11 Oct EXPLORE

    Building AI-powered apps for business

    OpenAI News

    OpenAI highlights Retool's low-code platform for secure, rapid development of business applications using GPT-4.

    Why it matters

    Low-code platforms integrating LLMs like Retool enable faster prototyping and deployment of internal business applications, impacting your 'build-vs-buy' strategy for departmental AI solutions.

    Hype 6/10
  6. 11 Oct EXPLORE

    Evolving online forms into dynamic data

    OpenAI News

    Typeform claims to use GPT-3.5 and GPT-4 to convert traditional online forms into dynamic, conversational data collection experiences.

    Why it matters

    This suggests a vendor-led approach to modernizing critical data intake processes, potentially reducing manual data entry and improving customer experience for G-SIBs.

    Hype 6/10
  7. 10 Oct EXPLORE

    Multimodality and Large Multimodal Models (LMMs)

    Chip Huyen

    Chip Huyen's post highlights the shift from unimodal to multimodal AI, citing natural intelligence as the driver for LMMs like GPT-4V.

    Why it matters

    Multimodal models extend AI beyond any single modality (text, image, or audio alone), allowing it to process complex, real-world banking inputs that combine them. This widens use case scope and adds to model validation complexity.

    Hype 4/10
  8. 9 Oct EXPLORE

    AI Engineer 2023 Keynote - Building Blocks for LLM Systems

    Eugene Yan

    Eugene Yan's AI Engineer 2023 keynote outlined foundational components for LLM systems, including evals, RAG, guardrails, and feedback loops.

    Why it matters

    This keynote consolidates current best practices for building robust LLM systems, validating the components G-SIBs are already integrating into their production pipelines.

    Hype 4/10
  9. 4 Oct EXPLORE

    Accelerating over 130,000 Hugging Face models with ONNX Runtime

    Hugging Face Blog

    Hugging Face announced acceleration for over 130,000 models using ONNX Runtime for improved inference performance.

    Why it matters

    This initiative provides a standardized, efficient path for optimizing a vast range of open-source models, directly impacting inference costs and deployment speed for G-SIBs leveraging Hugging Face assets.

    Hype 4/10
  10. 28 Sept EXPLORE

    Non-engineers guide: Train a LLaMA 2 chatbot

    Hugging Face Blog

    Hugging Face published a blog post guiding non-engineers through training a LLaMA 2 chatbot, focusing on accessibility for non-technical users.

    Why it matters

    The increasing ease of fine-tuning open-source LLMs like LLaMA 2 means internal citizen data scientists can contribute to model development if proper guardrails are established.

    Hype 4/10
  11. 26 Sept EXPLORE

    Llama 2 on Amazon SageMaker a Benchmark

    Hugging Face Blog

    Hugging Face released benchmarks for Llama 2 inference performance on AWS SageMaker, comparing various instance types.

    Why it matters

    Optimized Llama 2 inference on SageMaker provides G-SIBs with a clear baseline for cost-effective deployment of open-source LLMs in a managed cloud environment.

    Hype 4/10
  12. 25 Sept EXPLORE

    GPT-4V(ision) system card

    OpenAI News

    OpenAI released a system card for GPT-4V, detailing capabilities, limitations, and safety considerations for multimodal applications.

    Why it matters

    The GPT-4V system card outlines critical safety considerations for multimodal AI, directly informing your model risk frameworks for future vision-enabled applications in banking.

    Hype 5/10
  13. 19 Sept EXPLORE

    OpenAI Red Teaming Network

    OpenAI News

    OpenAI announced an open call for a Red Teaming Network, inviting domain experts to improve model safety.

    Why it matters

    This initiative provides G-SIBs a potential avenue to contribute to frontier model safety and influence vendor security practices, directly impacting downstream model risk assessments.

    Hype 4/10
  14. 19 Sept EXPLORE

    Rocket Money x Hugging Face: Scaling Volatile ML Models in Production

    Hugging Face Blog

    Rocket Money leveraged Hugging Face to manage and scale ML models in production, focusing on handling model volatility.

    Why it matters

    Rocket Money's experience with Hugging Face for scaling volatile ML models provides a relevant peer example for G-SIBs managing large-scale inference and model stability.

    Hype 4/10
  15. 15 Sept EXPLORE

    Optimizing your LLM in production

    Hugging Face Blog

    Hugging Face published a blog on LLM optimization techniques covering quantization, distillation, and efficient inference for production deployments.

    Why it matters

    Efficiently deploying LLMs in production is a primary cost and latency driver for any G-SIB scaling generative AI applications.

    Hype 4/10
  16. 13 Sept EXPLORE

    Fine-tuning Llama 2 70B using PyTorch FSDP

    Hugging Face Blog

    Hugging Face detailed fine-tuning Llama 2 70B with PyTorch FSDP, showcasing a method for distributed training on open-source LLMs.

    Why it matters

    This technical guide provides a concrete blueprint for G-SIBs considering fine-tuning open-source Llama 2 70B models with existing PyTorch infrastructure to leverage sensitive internal data.

    Hype 4/10
  17. 6 Sept EXPLORE

    Join us for OpenAI’s first developer conference on November 6 in San Francisco

    OpenAI News

    OpenAI announced its first developer conference, 'DevDay,' scheduled for November 6 in San Francisco, with a livestream keynote.

    Why it matters

    OpenAI's first developer conference signals major product announcements, likely including new models, API features, and pricing structures that will directly impact your bank's vendor strategy and build-vs-buy decisions.

    Hype 6/10
  18. 1 Sept EXPLORE

    Fetch Cuts ML Processing Latency by 50% Using Amazon SageMaker & Hugging Face

    Hugging Face Blog

    Fetch reports reducing its ML processing latency by 50% using Amazon SageMaker and Hugging Face infrastructure.

    Why it matters

    A 50% cut in ML processing latency using common cloud and open-source tooling is a tangible performance gain that may carry over to high-volume banking use cases, particularly real-time fraud detection or algorithmic trading.

    Hype 4/10
  19. 25 Aug EXPLORE

    Code Llama: Llama 2 learns to code

    Hugging Face Blog

    Meta released Code Llama, a large language model fine-tuned for code generation, available in several variants including Python-specific.

    Why it matters

    Code Llama offers a strong open-source option for G-SIBs to evaluate against proprietary models for internal developer tooling, potentially reducing licensing costs and increasing control.

    Hype 4/10
  20. 24 Aug EXPLORE

    OpenAI partners with Scale to provide support for enterprises fine-tuning models

    OpenAI News

    OpenAI announced a partnership with Scale AI to offer fine-tuning services for enterprises utilizing OpenAI's advanced models.

    Why it matters

    This partnership offers G-SIBs an assisted pathway to fine-tune OpenAI models, potentially simplifying bespoke model development while raising questions about data handling and IP retention.

    Hype 5/10
  21. 22 Aug EXPLORE

    GPT-3.5 Turbo fine-tuning and API updates

    OpenAI News

    OpenAI announced the general availability of fine-tuning for GPT-3.5 Turbo, allowing developers to customize the model with proprietary data.

    Why it matters

    Fine-tuning for GPT-3.5 Turbo moves more use cases from 'research with large models' to 'production with cost-effective models' for your organization.

    Hype 4/10
  22. 15 Aug EXPLORE

    Using GPT-4 for content moderation

    OpenAI News

    OpenAI claims to use GPT-4 for content policy definition and moderation, improving consistency and reducing human intervention.

    Why it matters

    OpenAI's internal deployment of GPT-4 for policy enforcement highlights a potential pathway for G-SIBs to automate compliance and operational risk controls beyond current rule-based systems.

    Hype 5/10
  23. 13 Aug EXPLORE

    How to Match LLM Patterns to Problems

    Eugene Yan

    Eugene Yan outlines a framework for matching LLM patterns (e.g., external/internal, data/non-data) to enterprise problem types.

    Why it matters

    This framework offers a structured approach to initial solution design, directly informing the build-vs-buy decision and model deployment strategy for enterprise use cases.

    Hype 4/10
  24. 10 Aug EXPLORE

    Hugging Face Hub on the AWS Marketplace: Pay with your AWS Account

    Hugging Face Blog

    Hugging Face Hub services are now available on AWS Marketplace, allowing enterprises to pay through existing AWS accounts.

    Why it matters

    Easier procurement for Hugging Face services through AWS Marketplace simplifies budget allocation and legal review for G-SIBs already operating on AWS.

    Hype 3/10
  25. 9 Aug EXPLORE

    Deploying Hugging Face Models with BentoML: DeepFloyd IF in Action

    Hugging Face Blog

    A Hugging Face blog post demonstrates deploying DeepFloyd IF with BentoML for local inference, highlighting how open-source models can be operationalized.

    Why it matters

    The detailed example of operationalizing a specific open-source model with BentoML provides a concrete reference architecture for G-SIBs exploring internal inference capabilities.

    Hype 4/10
  26. 8 Aug EXPLORE

    Fine-tune Llama 2 with DPO

    Hugging Face Blog

    Hugging Face published a tutorial on fine-tuning Llama 2 using Direct Preference Optimization (DPO) for improved alignment.

    Why it matters

    This tutorial offers a practical, well-documented pathway for G-SIBs to custom-align open-source Llama 2 models with specific banking data and compliance requirements, potentially reducing reliance on larger, closed models for certain tasks.

    Hype 4/10
  27. 2 Aug EXPLORE

    Towards Encrypted Large Language Models with FHE

    Hugging Face Blog

    Hugging Face researchers published a blog post outlining the potential for Fully Homomorphic Encryption (FHE) to secure LLM inference.

    Why it matters

    Fully Homomorphic Encryption offers a theoretical pathway to perform LLM inference on encrypted data, which, if it becomes practical, could significantly enhance data privacy and security for sensitive banking workloads.

    Hype 4/10
  28. 30 Jul EXPLORE

    Patterns for Building LLM-based Systems & Products

    Eugene Yan

    Eugene Yan outlines common architectural patterns for LLM systems, including RAG, fine-tuning, caching, guardrails, and defensive UX.

    Why it matters

    This compilation of established LLM patterns reinforces the standardized, production-grade components required for robust enterprise AI deployments.

    Hype 4/10
  29. 24 Jul EXPLORE

    AI Policy @🤗: Open ML Considerations in the EU AI Act

    Hugging Face Blog

    Hugging Face published an analysis of the EU AI Act's implications for open-source AI, focusing on potential compliance burdens.

    Why it matters

    Hugging Face's detailed critique of the EU AI Act's scope around open-source models informs your bank's regulatory interpretation and build-vs-buy strategy for foundation models.

    Hype 4/10
  30. 18 Jul EXPLORE

    Llama 2 is here - get it on Hugging Face

    Hugging Face Blog

    Meta released Llama 2, an openly available large language model, on Hugging Face, enabling broader access and fine-tuning.

    Why it matters

    Llama 2's open availability and its license, which permits commercial use subject to some restrictions, offer G-SIBs an alternative for on-premise model deployment and fine-tuning, directly impacting build-vs-buy decisions and vendor lock-in risk.

    Hype 5/10