Pitch Detection with Go

A minimal application to print out the incoming pitch from the microphone

Tuesday, May 14, 2024

Over the weekend I got excited about the idea of building a singing practice app where folks can sing along to music they own and get scored on pitch accuracy.

This feels ambitious, so I want to celebrate every small win.

The first small win - pitch detection!

The basic elements are as follows -

Here's the repository for reference

I used ChatGPT to outline and build what I needed.

The audio streaming element that captures incoming data from the microphone.

```go
package main

import (
	"github.com/gordonklaus/portaudio"
)

// initAudio initializes PortAudio and opens a mono input stream on the
// default microphone. Each stream.Read() fills the package-level buffer.
func initAudio() (*portaudio.Stream, error) {
	if err := portaudio.Initialize(); err != nil {
		return nil, err
	}

	// Open the default input device: 1 input channel, 0 output channels,
	// 44100 Hz, reading len(buffer) frames (2048) per call.
	stream, err := portaudio.OpenDefaultStream(1, 0, 44100, len(buffer), &buffer)
	if err != nil {
		portaudio.Terminate() // release PortAudio if the stream could not be opened
		return nil, err
	}
	return stream, nil
}
```

The element that processes the audio: it runs an FFT over each buffer read from the microphone and determines the dominant frequency.

```go
package main

import (
	"math/cmplx"

	"gonum.org/v1/gonum/dsp/fourier"
)

// processAudio returns the dominant frequency, in Hz, of one buffer of samples.
func processAudio(in []float32) float64 {
	// Convert float32 to float64 for FFT
	data := make([]float64, len(in))
	for i, v := range in {
		data[i] = float64(v)
	}

	// Create an FFT plan
	fft := fourier.NewFFT(len(data))
	// This performs the FFT and returns len(data)/2+1 complex coefficients
	coeff := fft.Coefficients(nil, data)

	// Find dominant frequency
	return findDominantFrequency(coeff, len(data))
}

// findDominantFrequency returns the center frequency of the strongest bin.
// n is the number of samples that went into the FFT.
func findDominantFrequency(coeff []complex128, n int) float64 {
	maxVal := 0.0
	var maxIdx int
	for i, v := range coeff {
		if abs := cmplx.Abs(v); abs > maxVal {
			maxVal = abs
			maxIdx = i
		}
	}
	sampleRate := 44100 // Define as per your setup
	// Bin i is centered at i*sampleRate/n. Divide by the sample count n,
	// not len(coeff): a real-input FFT returns only n/2+1 coefficients,
	// so dividing by len(coeff) would roughly double the reported pitch.
	return float64(maxIdx) * float64(sampleRate) / float64(n)
}
```
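To make the bin-to-frequency arithmetic concrete, here is a small sketch. It is not part of the repository, and `binFrequency` is a name made up for illustration. It assumes the 2048-sample buffer and 44100 Hz sample rate used in this post. For real-valued input of n samples, gonum's FFT returns n/2+1 coefficients, and coefficient k is centered at k * sampleRate / n.

```go
package main

import "fmt"

// binFrequency returns the center frequency, in Hz, of FFT bin k
// for n input samples captured at sampleRate. (Hypothetical helper.)
func binFrequency(k, n int, sampleRate float64) float64 {
	return float64(k) * sampleRate / float64(n)
}

func main() {
	const sampleRate = 44100.0 // assumed, matches the stream setup in this post
	const n = 2048             // assumed, matches the buffer size in this post

	fmt.Printf("coefficients returned: %d\n", n/2+1)
	fmt.Printf("bin width: %.2f Hz\n", binFrequency(1, n, sampleRate))
	fmt.Printf("bin 10 center: %.2f Hz\n", binFrequency(10, n, sampleRate))
}
```

At roughly 21.5 Hz per bin, neighboring low notes such as A2 (110 Hz) and A#2 (about 116.5 Hz) fall into the same bin, so a larger buffer or some interpolation between bins is probably needed before scoring a singer.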

And then the main() function to make'em kith -

```go
package main

import (
	"fmt"
	"log"
)

var buffer = make([]float32, 2048) // Buffer size must be appropriate for your use case

func main() {
	stream, err := initAudio()
	if err != nil {
		log.Fatalf("Error initializing audio: %v", err)
	}
	defer stream.Close()

	err = stream.Start()
	if err != nil {
		log.Fatalf("Error starting audio stream: %v", err)
	}
	defer stream.Stop()

	for {
		err = stream.Read()
		if err != nil {
			log.Printf("Error reading audio: %v", err)
			continue
		}

		pitch := processAudio(buffer) // Pass the buffer directly
		fmt.Printf("Detected pitch: %f Hz\n", pitch)
	}
}
```

And here's a screenshot of the output -

[Screenshot: pitch detection console output]

The 688 Hz reading sits between E5 (about 659 Hz) and F5 (about 698 Hz), a little closer to F5.
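Reading note names off raw Hz values by eye gets tedious, so here is a minimal sketch of turning a frequency into the nearest note plus an offset in cents. It is not from the repository. `nearestNote` and `noteNames` are names invented for this example, and it assumes equal temperament with A4 = 440 Hz and a frequency in the audible range.

```go
package main

import (
	"fmt"
	"math"
)

var noteNames = []string{"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"}

// nearestNote returns the closest equal-tempered note name (A4 = 440 Hz)
// and how many cents freq is above (+) or below (-) that note.
// It returns an empty name for frequencies too low to map to a note.
func nearestNote(freq float64) (string, float64) {
	if freq <= 0 {
		return "", 0
	}
	// Fractional MIDI note number: 69 is A4, 12 semitones per octave.
	midi := 69 + 12*math.Log2(freq/440)
	nearest := math.Round(midi)
	if nearest < 0 {
		return "", 0
	}
	cents := (midi - nearest) * 100
	n := int(nearest)
	// MIDI note 60 is C4, so the octave number is n/12 - 1.
	return fmt.Sprintf("%s%d", noteNames[n%12], n/12-1), cents
}

func main() {
	name, cents := nearestNote(688)
	fmt.Printf("688 Hz is closest to %s (%+.0f cents)\n", name, cents)
	// prints: 688 Hz is closest to F5 (-26 cents)
}
```

A score for pitch accuracy could then be built on the cents offset rather than on raw frequency, since cents measure the same musical distance in every octave.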

Questions -

- How do I mock the microphone input?
  - How can I get Ableton Live to route its output to this application? And how can I ensure I can still hear the audio?
- When I'm not actively singing into the mic, the background noise throws off wild pitch values. How can I gate the incoming audio?
- How should I architect the repository for greatest developer experience and velocity?
  - https://www.calhoun.io/moving-towards-domain-driven-design-in-go/
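For the mocking and gating questions, one possible starting point is sketched below. This is an untested idea rather than what the repository does. `sineBuffer`, `rms`, `gateOpen`, and the 0.01 threshold are all made up for illustration. The idea is to stand in for the microphone with a synthetic tone of known frequency, and to skip pitch detection whenever the buffer's RMS level falls below a threshold.

```go
package main

import (
	"fmt"
	"math"
)

// sineBuffer returns n samples of a pure tone, standing in for one
// buffer of microphone input. (Hypothetical helper.)
func sineBuffer(freq, sampleRate float64, n int, amplitude float64) []float32 {
	buf := make([]float32, n)
	for i := range buf {
		buf[i] = float32(amplitude * math.Sin(2*math.Pi*freq*float64(i)/sampleRate))
	}
	return buf
}

// rms returns the root-mean-square level of the buffer.
func rms(in []float32) float64 {
	if len(in) == 0 {
		return 0
	}
	var sum float64
	for _, v := range in {
		sum += float64(v) * float64(v)
	}
	return math.Sqrt(sum / float64(len(in)))
}

// gateOpen reports whether the buffer is loud enough to be worth analyzing.
func gateOpen(in []float32, threshold float64) bool {
	return rms(in) >= threshold
}

func main() {
	const threshold = 0.01 // assumed value; tune against real room noise

	loud := sineBuffer(440, 44100, 2048, 0.5)    // simulated singing
	quiet := sineBuffer(440, 44100, 2048, 0.001) // simulated background hum

	fmt.Println("loud buffer passes gate:", gateOpen(loud, threshold))
	fmt.Println("quiet buffer passes gate:", gateOpen(quiet, threshold))
}
```

In the read loop, a check like `gateOpen(buffer, threshold)` before calling `processAudio` would skip quiet buffers, and feeding `sineBuffer` output into `processAudio` gives a known frequency to compare against.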

Nate's Blog
