Perception aligned terminal-based audio spectrum visualizer.
compressed-lookas-2.mp4
compressed-lookas-1.mp4
compressed-interstellar.mp4
What
Lookas captures microphone input, system audio, or both, converts the signal into frequency bands using a mel-scale FFT, and renders it as smooth, physics-driven bars directly in your terminal.
How
Audio is captured from the microphone, system loopback, or both. The signal is windowed with a Hann function to reduce spectral leakage, then transformed via FFT into frequency bins. These bins are remapped onto a mel-scale filterbank so the visualization aligns with human loudness perception rather than linear frequency spacing.
Frequency balance is handled by A-weighting , which models the human ear's actual sensitivity curve across frequency. This keeps the display proportional to what you hear rather than what a meter would measure.
Dynamic range is managed continuously using percentile tracking instead of fixed scaling. A noise gate suppresses background hiss.
The spectrum uses asymmetric smoothing: energy attacks instantly so every transient is captured at full resolution, while the decay tail uses an exponential moving average for a smooth release.
Animation is driven by a spring-damper model rather than raw amplitude changes. Energy diffuses laterally between neighboring bands, which produces a fluid motion instead of twitchiness.
Rendering uses dense Unicode block characters to achieve smooth gradients without expensive redraws. The terminal is only cleared once per frame, layout is recomputed only when geometry changes, and output is written in large contiguous chunks to avoid flicker.
On modern Linux, this yields a stable 60+ FPS experience with audio-to-visual latency low enough to feel immediate.
You're probably used to CAVA and similar tools.
How both differ?
| Feature | CAVA | Lookas |
|---|---|---|
| Mapping | Logarithmic | Mel-scale |
| Dynamics | Autosens / Manual | Percentile tracking |
| Noise | Basic reduction | Explicit noise gate |
| Balance | Per-bar EQ | A-weighting |
| Smoothing | Quadratic gravity | Asymmetric EMA |
| Physics | Gravity fall-off | Spring-damper system |
| Interaction | None | Lateral energy diffusion |
Both Lookas ( ← ) and CAVA ( → ) are running on default configs
f.mp4
Simply run:
cargo install lookasYou usually don't need to install anything else.
If your system already has working audio (microphone or system sound),
Lookas will just run.
Important
On very minimal Linux installs, you might be missing a couple of audio packages. If Lookas fails to start or can't capture audio, run:
sudo apt install -y libasound2-dev pulseaudio-utilsThis works for both PulseAudio and PipeWire systems (via pipewire-pulse).
Lookas is zero-config by default.
Just run:
lookasIt will attempt to start with system audio.
If system audio isn't available, it automatically falls back to microphone input.
1– Microphone input2– System audio (loopback / monitor)3– Microphone + system mixr– Restart audio pipelineq– Quit
You do not need any configs to use this.
It ships with sensible defaults and works out of the box.
Configuration exists purely for customization.
If you want to tweak how it looks or reacts to sound, it can read a single TOML file and then apply environment variables on top.
Precedence is:
Environment variables > config file > defaults
The default location is:
~/.config/lookas.toml
You can also point to a file explicitly:
LOOKAS_CONFIG=/path/to/lookas.toml lookasTo create the config file, simply copy-paste this into your terminal
mkdir -p ~/.config
cat > ~/.config/lookas.toml <<'TOML'
fmin = 30.0
fmax = 16000.0
fft_size = 2048
target_fps_ms = 16
tau_spec = 0.06
gate_db = -55.0
flow_k = 0.18
spr_k = 60.0
spr_zeta = 1.0
TOMLThe fmin and fmax values determine the absolute lowest and highest frequencies rendered by the visualizer, measured in Hertz.
The fmin sets the bass cutoff, mapping to the lowest perceptible rumbles, with a practical minimum of around 20.0 and a maximum depending on your desired low-end focus. Setting it to 30.0 ignores inaudible sub-bass mud.
The fmax sets the treble ceiling, dictating the highest pitch the visualizer will display.
While human hearing maxes out around 20000.0, most meaningful audio content tapers off sooner, making 16000.0 a clean default.
Warning
Pushing fmax too high will result in empty bars on the right side of the spectrum if your audio source lacks high-frequency energy.
The fft_size dictates the number of samples per Fast Fourier Transform window. This process relies on the canonical algorithm detailed in the Cooley & Tukey (1965) paper. This value must strictly be a power of two. The absolute minimum is 512, which gives an incredibly fast response time but terribly blurred frequency detail.
The maximum practical value is 8192, yielding extreme detail at the cost of high CPU utilization and noticeable input latency. The default of 2048 sits at the optimal intersection of performance, minimal latency, and distinct frequency separation.
The target_fps_ms value establishes the render loop's resting heartbeat, defining the target time per frame in milliseconds.
Setting this to 16 forces the visualizer to update roughly 60 times a second. You can drop this down to 8 for an ultra-fluid 120 FPS experience if your terminal emulator can actually handle the throughput, or raise it to 33 to cap it at a relaxed 30 FPS, drastically cutting down on CPU overhead.
Warning
Going above 50 milliseconds will make the visualizer feel noticeably choppy and disconnected from the audio.
The tau_spec controls the time constant (τ) governing how quickly the bars decay after a transient (read the underlying math here.)
Attacks are always instant regardless of this value. tau_spec only affects the release: how long the energy lingers before fading. A minimal value close to 0.01 makes the decay almost immediate, while a value approaching 0.20 creates a long, heavy fade-out trail.
The default of 0.06 provides a polished decay that tracks fast transients without inducing visual fatigue.
The gate_db defines the absolute silence threshold in decibels. This dictates the point at which background noise is entirely suppressed. If your environment or microphone has a persistent hiss, you would raise this to a less negative number, like -30.0, which enforces strict silence and only allows distinctly loud audio to register.
If quiet nuances in your music are failing to appear on the visualizer, you drop this to a more negative extreme, something like -80.0, making the application highly sensitive. The 55.0 default is tuned to reject standard line noise while capturing quiet room ambiance.
The flow_k parameter governs the lateral diffusion of energy, bleeding a calculated fraction of one bar's amplitude into its immediate neighbors.
It's restricted between 0.0 and 1.0.
Setting this to the absolute minimum of 0.0 severs all connections, meaning every single frequency band moves independently like a raw digital meter.
Approaching the maximum of 1.0 causes so much bleed that the visualizer becomes an undefined, solid block of moving color.
The 0.18 default injects just enough organic cohesion so the bars move together like a continuous fluid wave.
The spr_k and spr_zeta variables completely dictate the physical weight and momentum of the bars using a second-order system model.
The spr_k represents spring stiffness, ranging from a lethargic minimum of 10.0 to a violently rigid maximum of 200.0, determining the raw force pulling the bar to its target height.
The spr_zeta represents the damping ratio (ζ) (for a deep dive, read this.)
A zeta value below 1.0 creates an underdamped system, causing the bars to bounce past their target before settling. A zeta value exactly at 1.0, the default, achieves critical damping, meaning the bar snaps to the target instantly with zero overshoot. Any value above 1.0 makes the system overdamped, resulting in a heavy, delayed crawl to the peak.
Every config value can be overridden temporarily using environment variables.
Examples:
LOOKAS_FFT_SIZE=4096 lookas
LOOKAS_GATE_DB=-60 lookas
LOOKAS_TARGET_FPS_MS=33 lookasPlease read this
MIT © @rccyx