As developers continue to build and deploy AI applications at scale across organizations, Azure is committed to delivering unprecedented choice in models as well as a flexible and comprehensive toolchain to handle the unique, complex, and diverse needs of modern enterprises. This powerful combination of the latest models and cutting-edge tooling empowers developers to create highly customized solutions grounded in their organization’s data. That’s why we are excited to announce several updates to help developers quickly create AI solutions with greater choice and flexibility using the Azure AI toolchain.
We are introducing a new model to the Phi family, Phi-3.5-MoE, a Mixture of Experts (MoE) model. This new model combines 16 smaller experts into one, which delivers improvements in model quality and lower latency. Although the model has 42B parameters in total, it uses only 6.6B active parameters at a time: subsets of the parameters (the experts) specialize during training, and at runtime only the experts relevant to the task are used. This approach gives customers the benefit of the speed and computational efficiency of a small model with the domain knowledge and higher quality outputs of a larger model. Read more about how we used a Mixture of Experts architecture to improve Azure AI translation performance and quality.
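To make the "active parameters" idea concrete, here is a minimal, illustrative sketch of top-k expert routing in plain Python. It is not the Phi-3.5-MoE implementation: the toy experts, the random router weights, and the choice of 2 experts per token (`TOP_K`) are assumptions for illustration only. The count of 16 experts is the only figure taken from the announcement.

```python
import math
import random

NUM_EXPERTS = 16  # from the announcement
TOP_K = 2         # assumption for illustration: experts consulted per token
DIM = 4           # toy hidden size


def softmax(values):
    """Numerically stable softmax over a small list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(seed, dim):
    """Hypothetical expert: an elementwise scaling with fixed random weights."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(dim)]
    return lambda x: [w * v for w, v in zip(weights, x)]


def route(token, router_weights, experts, top_k=TOP_K):
    """Score every expert, but run only the top_k highest-scoring ones."""
    logits = [sum(w * v for w, v in zip(row, token)) for row in router_weights]
    chosen = sorted(range(len(experts)), key=lambda i: logits[i], reverse=True)[:top_k]
    gates = softmax([logits[i] for i in chosen])  # renormalize over chosen experts
    output = [0.0] * len(token)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](token)  # the other experts are never evaluated
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen


rng = random.Random(0)
ROUTER = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
EXPERTS = [make_expert(i + 1, DIM) for i in range(NUM_EXPERTS)]

output, used = route([0.5, -1.0, 0.25, 2.0], ROUTER, EXPERTS)
print(f"experts evaluated: {len(used)} of {NUM_EXPERTS}")
```

Only the selected experts do any work for a given token, which is why the compute cost tracks the active parameter count rather than the total.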
We are also announcing a new mini model, Phi-3.5-mini. Both the new MoE model and the mini model are multi-lingual, supporting over 20 languages. The additional languages allow people to interact with the model in the language they are most comfortable using.
Even with the added languages, the new mini model, Phi-3.5-mini, remains remarkably compact at just 3.8B parameters.
Organizations such as CallMiner, a leader in conversational intelligence, are choosing Phi models for their impressive speed, accuracy, and security.
“At CallMiner, we’re continuously innovating and upgrading our conversation intelligence platform. We’re thrilled by the benefits brought by Phi models to our GenAI architecture. In our review of various models, we consistently focus on accuracy, speed, and security... The compact size of Phi models contributes to their exceptional speed, and fine-tuning enables us to customize the models to crucial use cases for our clients with high accuracy and in multiple languages. Moreover, the clear training process of the Phi models allows us to address bias and integrate GenAI securely. We are eager to increase the use of Phi models throughout our product range.”—Bruce McMahon, CallMiner’s Chief Product Officer.
To enhance predictability of outputs and establish the structure needed by applications, we are incorporating Guidance into the Phi-3.5-mini serverless endpoint. Guidance, a well-regarded open-source Python library boasting over 18K GitHub stars, empowers developers to clearly specify via a single API call the exact programmatic constraints the model must adhere to for structured outputs such as JSON, Python, HTML, SQL, and more. With Guidance, costly retries are eliminated, and developers can, for instance, require the model to select from predetermined lists (like medical codes), limit outputs to direct quotations from provided contexts, or ensure compliance with particular regex patterns. Guidance meticulously directs the model at every step within the inference stack, elevating output quality while also reducing costs and latency by approximately 30-50% in scenarios requiring tight structure.
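As a rough illustration of what constraining a model "at every step" means, the sketch below restricts a toy character-level decoder to a predetermined list of options. It does not use the Guidance API. The scoring function, the `PREFERENCE` string, and the option labels are all hypothetical stand-ins for a real model and a real code list.

```python
EOS = "<eos>"  # hypothetical end-of-sequence marker

# Hypothetical label set standing in for a predetermined list such as medical codes.
OPTIONS = ["A10", "B20", "B25", "C05"]

# Hypothetical model preference: characters earlier in this string score higher.
PREFERENCE = "mB2a5A1C0"


def toy_score(prefix, token):
    """Stand-in for a model's next-token score (higher is better)."""
    if token == EOS:
        return -50.0
    position = PREFERENCE.find(token)
    return -float(position) if position >= 0 else -100.0


def allowed_next(prefix, options):
    """Tokens that keep the output a prefix of at least one allowed option."""
    tokens = set()
    for option in options:
        if option.startswith(prefix):
            tokens.add(EOS if len(option) == len(prefix) else option[len(prefix)])
    return tokens


def constrained_decode(score, options):
    """Greedy decoding where disallowed tokens are masked out at each step."""
    prefix = ""
    while True:
        candidates = sorted(allowed_next(prefix, options))
        best = max(candidates, key=lambda token: score(prefix, token))
        if best == EOS:
            return prefix
        prefix += best


result = constrained_decode(toy_score, OPTIONS)
print(result)
```

The toy model would most like to emit `m`, which begins none of the options. Because that token is masked before selection, the output is always one of the listed values and no retry is needed.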
We are also updating the Phi vision model with multi-frame support. This means that Phi-3.5-vision (4.2B parameters) can reason over multiple input images, unlocking new scenarios such as identifying differences between images.
At the core of our product strategy, Microsoft is dedicated to supporting the development of safe and responsible AI, and provides developers with a robust suite of tools and capabilities.
Developers working with Phi models can assess quality and safety with Azure AI evaluations, using both built-in and custom metrics to inform any necessary mitigations. Azure AI Content Safety provides built-in controls and guardrails, such as prompt shields and protected material detection. These capabilities can be applied across models, including Phi, using content filters, or can be easily integrated into applications through a single API. Once in production, developers can monitor their application for quality and safety, adversarial prompt attacks, and data integrity, making timely interventions with the help of real-time alerts.
Furthering our goal to provide developers with access to the broadest selection of models, we are excited to also announce two new open models, Jamba 1.5 Large and Jamba 1.5, available in the Azure AI model catalog. These models use the Jamba architecture, blending Mamba and Transformer layers for efficient long-context processing.
AI21 states that the Jamba 1.5 Large and Jamba 1.5 are the most sophisticated models in the Jamba series, featuring a Hybrid Mamba-Transformer architecture. This design combines Mamba layers for processing close-range data with Transformer layers for capturing extensive context, optimizing the models for industries such as finance, healthcare, life sciences, retail, and consumer packaged goods.
“We are excited to deepen our collaboration with Microsoft, bringing the state-of-the-art of the Jamba Model family to Azure AI users… As an advanced hybrid SSM-Transformer (Structured State Space Model-Transformer) set of foundation models, the Jamba model family enables easy access to efficiency, low latency, high quality, and extensive context management. These models are designed to boost enterprise performance and are seamlessly integrated with the Azure AI platform”— Pankaj Dugar, Senior Vice President and General Manager of North America at AI21
Our efforts are currently focused on enhancing RAG (Retrieval-Augmented Generation) pipelines, which integrate the full spectrum of data handling from preparation to embedding. RAG is particularly useful in generative AI for leveraging specific organizational data without the need for retraining. It incorporates various retrieval tactics to respond to queries with precision based on the existing data. Nevertheless, extensive data preparation is needed for performing vector searches.
We are proud to announce the general availability of integrated vectorization in Azure AI Search today. This new feature simplifies the data management process by consolidating various steps into a unified workflow, allowing for automatic vector indexing and searches. It effortlessly enhances your application’s performance by unlocking the potential hidden within your data.
In addition to improving developer productivity, integrated vectorization enables organizations to offer turnkey RAG systems as solutions for new projects, so teams can quickly build an application specific to their datasets and needs, without having to build a custom deployment each time.
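For readers new to RAG, the steps that integrated vectorization consolidates (chunking, embedding, indexing, and querying) can be sketched in a few lines. This is a toy illustration, not the Azure AI Search API: the embedding here is a simple normalized word-count vector standing in for a real embedding model, and the function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=12):
    """Split a document into pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: unit-length sparse word-count vector (assumption, not a real model)."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def build_index(documents):
    """Chunk and embed every document once, up front."""
    return [(piece, embed(piece)) for doc in documents for piece in chunk(doc)]


def search(index, query, top=1):
    """Rank chunks by cosine similarity to the query vector."""
    q = embed(query)
    scored = [
        (sum(weight * vec.get(word, 0.0) for word, weight in q.items()), piece)
        for piece, vec in index
    ]
    scored.sort(reverse=True)
    return [piece for _, piece in scored[:top]]


DOCUMENTS = [
    "Invoices are due within thirty days of receipt.",
    "Support tickets are answered within one business day.",
]
INDEX = build_index(DOCUMENTS)
hits = search(INDEX, "When are invoices due?")
print(hits[0])
```

In a RAG application, the retrieved chunk would then be passed to the model as grounding context alongside the user's question.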
Customers like SGS & Co, a global brand impact group, are streamlining their workflows with integrated vectorization.
“SGS AI Visual Search is a GenAI application built on Azure for our global production teams to more effectively find sourcing and research information pertinent to their project… The most significant advantage offered by SGS AI Visual Search is utilizing RAG, with Azure AI Search as the retrieval system, to accurately locate and retrieve relevant assets for project planning and production”—Laura Portelli, Product Manager, SGS & Co
You can now extract custom fields from unstructured documents with high accuracy by building and training a custom generative model within Document Intelligence. This new ability uses generative AI to extract user-specified fields from documents across a wide variety of visual templates and document types. You can get started with as few as five training documents. While you build a custom generative model, automatic labeling saves time and effort on manual annotation. Results display as grounded where applicable, and confidence scores let you quickly filter high-quality extracted data for downstream processing and reduce manual review time.
Today we are excited to announce that Text to Speech (TTS) Avatar, a capability of Azure AI Speech service, is now generally available. This service brings natural-sounding voices and photorealistic avatars to life, across diverse languages and voices, enhancing customer engagement and overall experience. With TTS Avatar, developers can create personalized and engaging experiences for their customers and employees, while also improving efficiency and providing innovative solutions.
The TTS Avatar service provides developers with a variety of pre-built avatars, featuring a diverse portfolio of natural-sounding voices, as well as an option to create custom synthetic voices using Azure Custom Neural Voice. Additionally, the photorealistic avatars can be customized to match a company’s branding. For example, Fujifilm is using TTS Avatar with NURA, the world’s first AI-powered health screening center.
“Embracing the Azure TTS Avatar at NURA as our 24-hour AI assistant marks a pivotal step in healthcare innovation. At NURA, we envision a future where AI-powered assistants redefine customer interactions, brand management, and healthcare delivery. Working with Microsoft, we’re honored to pioneer the next generation of digital experiences, revolutionizing how businesses connect with customers and elevate brand experiences, paving the way for a new era of personalized care and engagement. Let’s bring more smiles together”—Dr. Kasim, Executive Director and Chief Operating Officer, Nura AI Health Screening
As we bring this technology to market, ensuring responsible use and development of AI remains our top priority. Custom Text to Speech Avatar is a limited access service in which we have integrated safety and security features. For example, the system embeds invisible watermarks in avatar outputs. These watermarks allow approved users to verify if a video has been created using Azure AI Speech’s avatar feature. Additionally, we provide guidelines for TTS avatar’s responsible use, including measures to promote transparency in user interactions, identify and mitigate potential bias or harmful synthetic content, and how to integrate with Azure AI Content Safety. In this transparency note, we describe the technology and capabilities for TTS Avatar, its approved use cases, considerations when choosing use cases, its limitations, fairness considerations and best practice for improving system performance. We also require all developers and content creators to apply for access and comply with our code of conduct when using TTS Avatar features including prebuilt and custom avatars.
We’re excited to share that the VS Code extension for Azure Machine Learning is now generally available. This extension enables you to build, train, deploy, debug, and manage machine learning models using Azure Machine Learning directly from your preferred VS Code setup, on desktop or web. It includes features such as VNET support, IntelliSense, and integration with Azure Machine Learning CLI, making it suitable for production use. Learn more about the extension by reading this tech community blog.
Companies such as Fashable have already implemented this in their production environment.
“The VS Code extension for Azure Machine Learning has substantially improved our workflow since its preview release. Being able to handle everything from building to deploying models directly within VS Code has transformed our operations. The comprehensive integration and powerful features like interactive debugging and VNET support have boosted our efficiency and teamwork. We are excited about its general availability and are eager to utilize its full capabilities in our AI endeavors.”—Ornaldo Ribas Fernandes, Co-founder and CEO, Fashable
We are also pleased to announce the general availability of the Conversational PII Detection Service in Azure AI Language, which enhances Azure AI’s functionality for identifying and redacting sensitive information in conversations that use the English language. Tailored to improve data privacy and security for developers creating generative AI apps, this service builds upon the Text PII redaction service, helping users detect and redact sensitive details like phone numbers and email addresses in unstructured text from various sources, including meeting transcripts and calls.
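To show the kind of redaction described here, the following sketch masks email addresses and phone numbers in a short transcript using regular expressions. It is only an illustration of the input and output shape: the service itself is model-based, and these hand-written patterns, placeholder labels, and the sample transcript are assumptions that will miss many real-world formats.

```python
import re

# Simplified patterns for illustration; real PII detection needs far broader coverage.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(
    r"(?<!\w)(?:\+?\d{1,3}[ .-]?)?(?:\(\d{3}\)|\d{3})[ .-]?\d{3}[ .-]?\d{4}(?!\w)"
)


def redact(text):
    """Replace detected emails and phone numbers with category placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


transcript = (
    "Agent: Can I confirm your number? "
    "Caller: Sure, it's 555-010-4477, and my email is pat@example.com."
)
redacted = redact(transcript)
print(redacted)
```

Replacing each match with a category label rather than deleting it keeps the transcript readable for downstream summarization or analytics.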
We recently announced updates to Azure OpenAI Service, including the capability to manage Azure OpenAI Service quota deployments independently without requiring support from your account team. This change allows for more flexible and efficient requests for Provisioned Throughput Units (PTUs). Additionally, on 8/7 we released OpenAI’s latest model, which introduced Structured Outputs, such as JSON Schemas, for the new GPT-4o and GPT-4o mini models. Structured Outputs are especially useful for developers who need AI outputs validated and formatted into structures such as JSON Schemas.
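As a sketch of what a Structured Outputs request can look like, the snippet below defines a JSON Schema, wraps it in a `response_format` value of type `json_schema`, and checks a sample reply against the schema. The ticket schema, its field names, and the `conforms` helper are hypothetical, and the checker covers only the schema features used in this example. No service call is made.

```python
import json

# Hypothetical schema for a support-ticket triage result.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["billing", "technical", "other"]},
        "summary": {"type": "string"},
        "urgent": {"type": "boolean"},
    },
    "required": ["category", "summary", "urgent"],
    "additionalProperties": False,
}

# Value passed as the response_format parameter of a chat completions request.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "ticket", "strict": True, "schema": TICKET_SCHEMA},
}

PY_TYPES = {"string": str, "boolean": bool}


def conforms(payload, schema):
    """Minimal check: required keys, no extra keys, primitive types, enums."""
    if not isinstance(payload, dict):
        return False
    properties = schema["properties"]
    if any(key not in payload for key in schema["required"]):
        return False
    if schema.get("additionalProperties") is False and set(payload) - set(properties):
        return False
    for key, value in payload.items():
        rule = properties.get(key)
        if rule is None:
            continue  # extra keys are tolerated when the schema allows them
        if not isinstance(value, PY_TYPES[rule["type"]]):
            return False
        if "enum" in rule and value not in rule["enum"]:
            return False
    return True


sample = json.loads('{"category": "billing", "summary": "Charged twice", "urgent": true}')
print(conforms(sample, TICKET_SCHEMA))
```

With strict Structured Outputs, the model's reply is expected to satisfy the schema already, so a check like `conforms` acts as a defensive assertion rather than a retry trigger.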
We continue to invest in the Azure AI stack to deliver cutting-edge innovation to our customers. This empowers you to build, deploy, and scale your AI solutions with safety and confidence. We are excited to see the innovations you will bring next.