Software Architect / Microsoft MVP (AI) and Technical Author


Audio Notes is Live!

Over the last few weeks, I have been building the Audio Notes Micro-SaaS MVP in public.


I’ve worked on this when I had spare time and estimate it took between 40 and 60 hours to create the entire solution.


The AI-centric solution uses speech-to-text and document summarization to help you automatically generate concise notes from the spoken word.


A massive time saver.


Powering the solution:

  • Azure AI services, specifically Azure AI Speech and Azure AI Language.
  • Code written in a combination of C# and JavaScript.
  • Microsoft .NET 8.
  • Azure App Service to host the web application.
  • Azure SQL for data storage.
  • Entity Framework for database access.


Previous Posts

All previous posts are available here.  Each post will show you how the idea was conceived and how each component that forms the solution was built.


Architecture and key processes are also detailed.


  1. Introduction
  2. Using Azure AI Speech to Perform Continual Speech to Text
  3. Transcription Using Azure AI Language to Perform Document Summarization
  4. Blending Azure AI Speech and Azure Language to Create a Micro SaaS
  5. Creating an Interface to Browse Content
  6. Audio Notes: Creating an Interface to Record Content


Each post shows the evolution of each screen, the key capabilities, and how technical challenges were overcome.  Mock-ups and UML sequence diagrams depict how the solution looks and how processes interact with each other.


How to Access

First, here is how to access Audio Notes:

  1. Browse to
  2. Register with an email address.
  3. Activate account using the email you receive and login at


Check spam for the activation email.  This will be the only email you receive from the application for now.


How to Use

Use Audio Notes to automatically transcribe speech to text in real-time.  Key points can be automatically extracted from the transcription and summarised in a single click.


To record, transcribe, and summarize an audio note, perform the following steps:


1. Click Record Note.


2. Supply a title:


3. Click Start Recognition and speak.


4. Speech will be transcribed in real-time. You may be prompted to grant the application access to your microphone.


5. Click Stop Recognition when complete.


6. Click Summarise Transcript to get a concise summary from the transcript.
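
The steps above can be sketched in code. Continuous speech recognisers, including the one in the Azure Speech SDK, typically raise partial results while a phrase is being spoken and a final result once the phrase is complete. The sketch below shows one way to keep the two apart, so the screen can update in real-time while only finalised text goes forward for summarisation. It is a minimal illustration and not the Audio Notes code itself: `createTranscriptBuffer` and its method names are hypothetical, and the wiring to a real recogniser is left out.

```javascript
// Hypothetical helper: collects recognised phrases into a single transcript.
function createTranscriptBuffer() {
  const phrases = []; // finalised phrases, in the order they were recognised
  let interim = "";   // latest partial hypothesis for the phrase in progress

  return {
    // Call with each partial result while the speaker is mid-phrase.
    onRecognizing(text) {
      interim = text;
    },
    // Call with each final result. Empty results (silence) are ignored.
    onRecognized(text) {
      const trimmed = (text || "").trim();
      if (trimmed) phrases.push(trimmed);
      interim = "";
    },
    // Text to show on screen: finalised phrases plus the phrase in progress.
    display() {
      return [...phrases, interim].filter(Boolean).join(" ");
    },
    // Text to send for summarisation: finalised phrases only.
    transcript() {
      return phrases.join(" ");
    },
  };
}
```

In a browser, the partial and final result handlers of the recogniser would call `onRecognizing` and `onRecognized`, and the Summarise Transcript button would send the output of `transcript()` to the server.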



Some examples of Audio Notes in action follow.

Transcription from Bloomberg with Satya Nadella

Here you can see a recent transcription from Bloomberg Live via YouTube with Satya Nadella discussing AI and Tech in 2024:

Transcription of an Interview with DJ Carl Cox

In this image, you can see that AI has automatically transcribed an interview with Carl Cox.



In this demo, you will see a note being recorded in real-time. I play the YouTube video from my cell phone, and the laptop microphone picks up the audio, which is transcribed in real-time.

The transcription is then summarised.


Use Cases

Sample use cases for this include, but are not limited to:


  • Meeting Recordings: At the start of meetings, a designated device records the audio of discussions.
  • Document Creation: Transcribed text can be formatted into a document, organizing content by speaker or topic for clarity.
  • Summary Generation: Use the summary to highlight the most relevant content in a discussion.
  • Action Items Tracking: Any action items or tasks assigned during a meeting can be compiled into a separate list for tracking and follow-up in subsequent meetings.



Some benefits include:


  • Time Savings: Eliminates the need for manual note-taking during meetings.
  • Accuracy: Reduces the risk of errors or omissions that may occur with traditional note-taking methods.
  • Accessibility: Provides accessible meeting summaries for individuals who may have difficulty attending meetings in person.
  • Knowledge Retention: Ensures that important decisions, discussions, and action items are documented and easily retrievable for future reference.


Next Steps

I’m interested in some of the batch transcription capabilities in Azure AI Speech services.

You can use this to ingest audio from a URL and transcribe longer-running audio files or memory streams.

Podcast audio files and URLs can be supplied and processed.  A transcription will be generated.

You can then run analytics over the content being discussed or get a summary of the entire podcast in just a few minutes, without having to listen to the entire podcast.
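
As a rough idea of what submitting a podcast URL could look like, here is a sketch that builds (but does not send) a batch transcription request. It assumes the v3.1 REST path and the `contentUrls`, `locale`, and `displayName` body fields, so check the current Azure AI Speech REST reference before relying on it. The function name, region, key, and audio URL are all placeholders.

```javascript
// Hypothetical helper: describes the HTTP request for a batch transcription job.
// Assumption: the v3.1 REST path and field names. Verify against the current API reference.
function buildBatchTranscriptionRequest({
  region,
  subscriptionKey,
  audioUrls,
  locale = "en-US",
  displayName = "Audio Notes batch",
}) {
  if (!Array.isArray(audioUrls) || audioUrls.length === 0) {
    throw new Error("At least one audio URL is required");
  }
  return {
    url: `https://${region}.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions`,
    method: "POST",
    headers: {
      "Ocp-Apim-Subscription-Key": subscriptionKey,
      "Content-Type": "application/json",
    },
    // The service fetches the audio from these URLs itself.
    body: JSON.stringify({ contentUrls: audioUrls, locale, displayName }),
  };
}

// Placeholder values for illustration only.
const request = buildBatchTranscriptionRequest({
  region: "westeurope",
  subscriptionKey: "<your-speech-key>",
  audioUrls: ["https://example.com/podcast-episode.mp3"],
});
```

The request could then be sent with `fetch(request.url, request)`. Batch jobs run asynchronously, so the caller would poll the transcription resource returned by the service until it reports completion, then download the result files.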


OpenAI and Further Thoughts

Additionally, OpenAI recently shipped word-level timestamps with their Whisper Audio API.  It gives you word-level precision for transcripts of audio or video.

I have many ideas for this capability.
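
One such idea is jumping straight to the moments where a term is mentioned. The sketch below assumes the transcript has already been fetched and that each word arrives as an object with `word`, `start`, and `end` fields, with times in seconds. The sample data and both function names are made up for illustration.

```javascript
// Format a time in seconds as mm:ss.
function formatTimestamp(seconds) {
  const m = Math.floor(seconds / 60);
  const s = Math.floor(seconds % 60);
  return `${String(m).padStart(2, "0")}:${String(s).padStart(2, "0")}`;
}

// Return the start time of every occurrence of a term, ignoring case and punctuation.
function findMentions(words, term) {
  const needle = term.toLowerCase();
  return words
    .filter((w) => w.word.toLowerCase().replace(/[^a-z0-9']/g, "") === needle)
    .map((w) => formatTimestamp(w.start));
}

// Hypothetical sample shaped like a word-level timestamp result.
const sampleWords = [
  { word: "Azure", start: 1.2, end: 1.6 },
  { word: "is", start: 1.6, end: 1.8 },
  { word: "growing.", start: 1.8, end: 2.4 },
  { word: "azure,", start: 75.4, end: 75.9 },
];

const mentions = findMentions(sampleWords, "azure");
```

With timestamps like these, a note could link each mention back to the exact point in the recording.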

Give Audio Notes a try. It’s in MVP stage and I’m looking for feedback.  Thanks.
