Pitch Detection with Go

A minimal application to print out the incoming pitch from the microphone

Over the weekend I got excited about the idea of building a singing practice app where folks can sing along to music they own and get scored on pitch accuracy.

This feels ambitious, so I want to celebrate every small win.

The first small win - pitch detection!

Here's the repository for reference.

I used ChatGPT to outline and build what I needed.

The basic elements are as follows -

  1. The audio streaming element that captures incoming data from the microphone.

```go
package main

import (
	"github.com/gordonklaus/portaudio"
)

// initAudio initializes PortAudio and opens a stream that captures audio
// from the default microphone into the package-level buffer.
func initAudio() (*portaudio.Stream, error) {
	if err := portaudio.Initialize(); err != nil {
		return nil, err
	}

	// Open the default input device: 1 input channel, 0 output channels,
	// 44100 Hz, reading len(buffer) frames (2048) per call to stream.Read.
	stream, err := portaudio.OpenDefaultStream(1, 0, 44100, len(buffer), &buffer)
	if err != nil {
		portaudio.Terminate() // release PortAudio if the stream cannot be opened
		return nil, err
	}
	return stream, nil
}
```
  2. The element that processes the audio: it runs an FFT over each buffer read from the microphone and determines its dominant frequency.

```go
package main

import (
	"math/cmplx"

	"gonum.org/v1/gonum/dsp/fourier"
)

// processAudio returns the dominant frequency in Hz of one buffer of samples.
func processAudio(in []float32) float64 {
	// Convert float32 to float64 for the FFT.
	data := make([]float64, len(in))
	for i, v := range in {
		data[i] = float64(v)
	}

	// Create an FFT plan sized to the buffer.
	fft := fourier.NewFFT(len(data))
	// A real FFT over n samples returns n/2+1 complex coefficients.
	coeff := fft.Coefficients(nil, data)

	// Find the dominant frequency.
	return findDominantFrequency(coeff, len(data))
}

// findDominantFrequency returns the center frequency in Hz of the bin with
// the largest magnitude. n is the number of time-domain samples.
func findDominantFrequency(coeff []complex128, n int) float64 {
	maxVal := 0.0
	var maxIdx int
	for i, v := range coeff {
		if abs := cmplx.Abs(v); abs > maxVal {
			maxVal = abs
			maxIdx = i
		}
	}
	sampleRate := 44100 // must match the rate the stream was opened with
	// Bin i is centered at i*sampleRate/n. Divide by n, not len(coeff):
	// len(coeff) is n/2+1, which would nearly double the result.
	return float64(maxIdx) * float64(sampleRate) / float64(n)
}
```
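As a rough sanity check on the FFT step, here is a minimal sketch of the arithmetic that links a bin index to a frequency, assuming the 44100 Hz sample rate and 2048-sample buffer used in this post. For a real-input FFT over n samples, bin i is centered at i * sampleRate / n. The helper names (`binWidth`, `binFrequency`, `semitoneGap`) are illustrative and not part of the repository.

```go
package main

import (
	"fmt"
	"math"
)

// binWidth returns the spacing in Hz between adjacent FFT bins
// when the transform is taken over n samples.
func binWidth(sampleRate float64, n int) float64 {
	return sampleRate / float64(n)
}

// binFrequency returns the center frequency in Hz of bin i.
func binFrequency(i int, sampleRate float64, n int) float64 {
	return float64(i) * binWidth(sampleRate, n)
}

// semitoneGap returns the distance in Hz from freq up to the next
// semitone in twelve-tone equal temperament.
func semitoneGap(freq float64) float64 {
	return freq * (math.Pow(2, 1.0/12) - 1)
}

func main() {
	const sampleRate, n = 44100.0, 2048 // values assumed from this post

	fmt.Printf("bin width: %.2f Hz\n", binWidth(sampleRate, n))
	fmt.Printf("bin 32 center: %.2f Hz\n", binFrequency(32, sampleRate, n))

	// Compare the bin width with the semitone spacing at E2 and E5.
	for _, f := range []float64{82.41, 659.26} {
		fmt.Printf("semitone gap at %.2f Hz: %.2f Hz\n", f, semitoneGap(f))
	}
}
```

With these settings the bins are about 21.5 Hz apart. That is comfortably finer than the roughly 39 Hz between E5 and F5, but much coarser than the roughly 5 Hz between E2 and F2, so low notes would likely need a longer buffer or a different estimator.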
  3. And then the main() function to make 'em kith -

```go
package main

import (
	"fmt"
	"log"
)

// buffer receives one block of samples per stream.Read call.
// The buffer size must be appropriate for your use case.
var buffer = make([]float32, 2048)

func main() {
	stream, err := initAudio()
	if err != nil {
		log.Fatalf("Error initializing audio: %v", err)
	}
	defer stream.Close()

	if err := stream.Start(); err != nil {
		log.Fatalf("Error starting audio stream: %v", err)
	}
	defer stream.Stop()

	for {
		if err := stream.Read(); err != nil {
			log.Printf("Error reading audio: %v", err)
			continue
		}

		pitch := processAudio(buffer) // pass the buffer directly
		fmt.Printf("Detected pitch: %f Hz\n", pitch)
	}
}
```

And here's a screenshot of the output -

[Screenshot: pitch detection console output]

688 Hz is sharp of E5 (about 659 Hz) and in fact sits closer to F5 (about 698 Hz).
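To place a reading like 688 Hz on the scale without looking it up, here is a minimal sketch that converts a frequency to the nearest equal-tempered note and an offset in cents. It assumes A4 = 440 Hz and sharps-only note names; `nearestNote` is a hypothetical helper, not something from the repository.

```go
package main

import (
	"fmt"
	"math"
)

var noteNames = []string{"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"}

// nearestNote maps a frequency in Hz to the closest equal-tempered note
// (assuming A4 = 440 Hz) and the offset from that note in cents.
// A negative offset means the input is flat of the note.
// It returns an empty name for frequencies it cannot place.
func nearestNote(freq float64) (string, float64) {
	if freq <= 0 {
		return "", 0
	}
	// MIDI note numbers put A4 at 69, with 12 notes per octave.
	midi := 69 + 12*math.Log2(freq/440)
	nearest := math.Round(midi)
	if nearest < 0 {
		return "", 0 // below the MIDI range
	}
	cents := (midi - nearest) * 100
	n := int(nearest)
	// MIDI 60 is C4, so the octave number is n/12 - 1.
	name := fmt.Sprintf("%s%d", noteNames[n%12], n/12-1)
	return name, cents
}

func main() {
	name, cents := nearestNote(688)
	fmt.Printf("688 Hz is closest to %s (%+.0f cents)\n", name, cents)
}
```

For 688 Hz this reports F5 at about 26 cents flat, which is the same pitch as about 74 cents sharp of E5. An offset in cents is also a natural starting point for scoring pitch accuracy later on.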

Questions -

  • How do I mock the microphone input?
    • How can I route Ableton Live's output into this application? And how can I ensure I can still hear the audio?
  • When I'm not actively singing into the mic, the background noise throws off wild pitch values. How can I gate the incoming audio?
  • How should I architect the repository for the best developer experience and velocity?
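For the mocking and gating questions, one possible starting point is sketched below, with no claim that it is the right design. `sineWave` generates a synthetic tone that could be passed to processAudio in place of a microphone buffer, and `gateOpen` is a simple RMS level check that the main loop could use to skip quiet buffers. All three function names and the 0.02 threshold are assumptions to be tuned, not values taken from the repository.

```go
package main

import (
	"fmt"
	"math"
)

// sineWave returns n samples of a pure tone, standing in for one
// buffer of microphone input. amplitude is the peak level (0 to 1).
func sineWave(freq, sampleRate, amplitude float64, n int) []float32 {
	buf := make([]float32, n)
	for i := range buf {
		t := float64(i) / sampleRate
		buf[i] = float32(amplitude * math.Sin(2*math.Pi*freq*t))
	}
	return buf
}

// rms returns the root-mean-square level of buf, or 0 if buf is empty.
func rms(buf []float32) float64 {
	if len(buf) == 0 {
		return 0
	}
	var sum float64
	for _, v := range buf {
		sum += float64(v) * float64(v)
	}
	return math.Sqrt(sum / float64(len(buf)))
}

// gateOpen reports whether buf is loud enough to be worth analysing.
func gateOpen(buf []float32, threshold float64) bool {
	return rms(buf) >= threshold
}

func main() {
	const threshold = 0.02 // hypothetical; tune for your mic and room

	voice := sineWave(440, 44100, 0.5, 2048)   // stands in for singing
	noise := sineWave(440, 44100, 0.005, 2048) // stands in for a quiet room

	fmt.Println("voice passes gate:", gateOpen(voice, threshold))
	fmt.Println("noise passes gate:", gateOpen(noise, threshold))
}
```

In the read loop, this would amount to calling `gateOpen(buffer, threshold)` after each successful `stream.Read()` and only calling processAudio when it returns true. A fixed threshold is the simplest option; real background noise is not a clean tone, so the value would need checking against actual recordings.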