Astronomy Through Data: Measuring the Universe with Light, Time, and Motion

Light Curves as Signals: Variability, Noise, and Period Finding

Chapter 6

Light curves as time-series signals

A light curve is a record of brightness versus time. In data terms, it is a sampled signal: you observe a source at discrete times, each observation returns a flux (or magnitude), and the sequence encodes physical variability plus measurement and environmental effects. Treating a light curve as a signal is powerful because it gives you a toolbox: filtering, detrending, frequency analysis, and model comparison.

In astronomy, variability can be intrinsic (the object changes) or extrinsic (something blocks or magnifies it). Examples include pulsating stars, eclipsing binaries, rotating spotted stars, transiting exoplanets, accreting compact objects, and microlensing. The same signal-processing questions appear across these cases: What is the baseline? What is noise versus real change? Is there a period? If so, how stable is it? Are there multiple periods? Are there transient events?

Unlike many engineered signals, astronomical light curves are often irregularly sampled (gaps due to day/night, weather, scheduling), have heteroscedastic uncertainties (errors change with brightness and conditions), and contain systematics (instrumental or atmospheric trends). Period finding methods must handle these realities.

Key representations: flux, magnitude, and relative units

Period searches and variability metrics can be performed in flux or magnitude, but the choice matters. Flux is additive and often closer to Gaussian noise for photon-limited measurements. Magnitude compresses dynamic range and can make symmetric flux variations look asymmetric. For small fractional variations, both behave similarly, but for deeper eclipses or transits, flux is usually easier to model physically. A common approach is to normalize flux to a relative scale (e.g., divide by the median) so the baseline is near 1, then analyze fractional changes.
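As a minimal sketch of the conversion and normalization described above, the standard magnitude-flux relation can be applied as follows. The function names and the sample magnitudes are illustrative assumptions, not part of any particular library.

```python
import statistics

def mag_to_relative_flux(mags):
    """Convert magnitudes to flux relative to the median magnitude."""
    m_ref = statistics.median(mags)
    # Standard relation: a difference of 2.5 mag is a factor of 10 in flux.
    return [10 ** (-0.4 * (m - m_ref)) for m in mags]

def normalize_flux(flux):
    """Divide by the median so the baseline sits near 1."""
    baseline = statistics.median(flux)
    return [f / baseline for f in flux]

# Hypothetical magnitudes: four baseline points and one deep eclipse.
mags = [12.00, 12.01, 11.99, 12.00, 12.75]
rel_flux = mag_to_relative_flux(mags)
for m, f in zip(mags, rel_flux):
    print(f"mag {m:5.2f} -> relative flux {f:.3f}")
```

In this example the 0.75 mag eclipse corresponds to roughly half the baseline flux, which illustrates how magnitudes compress large flux changes.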

When combining data from different nights or instruments, you may need per-segment offsets (different zero points) or scaling. In signal terms, those are low-frequency components or step functions that can masquerade as long periods if not handled.

Variability: what you are trying to detect

Periodic, quasi-periodic, and aperiodic behavior

Many sources are approximately periodic: pulsators repeat with a stable period; eclipsing binaries repeat with orbital period; rotating stars repeat with rotation period. But “periodic” in real data often means “periodic plus complications.” Starspots evolve, changing amplitude and phase; accretion flickering adds noise; pulsators can have multiple modes; binaries can show ellipsoidal variations plus eclipses.

Quasi-periodic signals have a characteristic timescale but not a perfectly stable phase. In frequency space, they appear as broadened peaks rather than sharp lines. Aperiodic variability includes flares, stochastic flickering, and red-noise processes where low frequencies dominate.

Signal shapes matter

Period finding is not only about frequency; it is also about waveform. A sinusoid is easy: a single frequency describes it. Eclipses and transits are not sinusoidal: they are narrow dips with flat out-of-eclipse regions. A method optimized for sinusoids can still find the correct period, but it may be less sensitive or may favor harmonics (e.g., half the true period) because a dip repeats twice per orbit in some systems or because the Fourier representation needs multiple harmonics.

Before choosing a period method, ask: is the expected signal sinusoidal (pulsation), box-like (transit), sawtooth-like (some pulsators), or a mix (binary with reflection + eclipse)?

Noise and systematics: what can fool you

Random noise sources

Random noise includes photon counting noise, background noise, read noise, and scintillation. In time-series terms, this is often approximated as “white noise” (uncorrelated from point to point), but real light curves frequently have time-correlated noise (“red noise”) from atmosphere, guiding, focus drift, or detector temperature changes. Correlated noise is especially dangerous for transit-like signals because it can create dip-like structures that appear coherent over tens of minutes to hours.

Systematics and trends

Systematics are repeatable patterns not caused by the target. Common examples: airmass trends (object gets dimmer as it approaches the horizon), seeing variations affecting aperture photometry, differential extinction due to color differences between target and comparison stars, and instrument drifts. These often appear as low-frequency variations. If you search for long periods without removing trends, you may “discover” the observing schedule or atmospheric behavior instead of astrophysics.

Sampling effects: cadence, gaps, and aliasing

Sampling defines what periods you can detect. For evenly sampled data, periods shorter than about twice the cadence are not resolved: with a 10-minute cadence, you cannot resolve a 2-minute oscillation. If you observe only at night, you imprint a ~1-day window function that creates aliases: false peaks at frequencies offset by integer multiples of 1 cycle/day. Seasonal gaps can create yearly aliases. Irregular sampling helps reduce some aliasing but introduces its own complexity.

A practical rule: always examine the spectral window (the periodogram of a constant signal sampled at your observation times). Peaks in the window indicate frequencies where aliases are likely. If a candidate period is close to a strong window peak or its combinations, be cautious.
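One common way to compute a spectral window is the squared modulus of the Fourier transform of the sampling times alone. The sketch below assumes that definition; the function name and the strictly nightly sampling pattern are hypothetical choices for illustration.

```python
import cmath
import math

def spectral_window(times, freqs):
    """Normalized window power |(1/N) * sum(exp(-2*pi*i*f*t))|^2 per frequency."""
    n = len(times)
    power = []
    for f in freqs:
        total = sum(cmath.exp(-2j * math.pi * f * t) for t in times)
        power.append(abs(total / n) ** 2)
    return power

# Hypothetical sampling: one observation per night, same time, for 30 nights.
times = [float(day) for day in range(30)]
freqs = [0.5, 1.0, 1.5, 2.0]  # cycles/day
w = spectral_window(times, freqs)
for f, p in zip(freqs, w):
    print(f"f = {f:.1f} c/d  window power = {p:.3f}")
```

For this idealized sampling the window reaches its maximum at 1 and 2 cycles/day, which is where alias peaks of a real signal would be expected.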

Preprocessing a light curve for period analysis

Period finding is most reliable when the input series is cleaned in a controlled way. The goal is not to “force” periodicity but to remove obvious non-astrophysical artifacts while preserving real variability.

Step-by-step: prepare the time series

  • 1) Assemble arrays: time t, flux (or magnitude) y, and uncertainty sigma. Keep times in a consistent system (e.g., barycentric time if available) because timing errors can smear short periods.
  • 2) Remove invalid points: drop NaNs, saturated points, and measurements flagged as bad. Keep a record of what you removed.
  • 3) Outlier handling: use a robust method such as median absolute deviation (MAD) on residuals from a smooth trend, or sigma-clipping within short time windows. Avoid clipping real flares or eclipses unless you are explicitly focusing on periodic baseline behavior.
  • 4) Normalize: convert to relative flux (divide by median) or subtract the median magnitude. This centers the series and makes amplitudes comparable across segments.
  • 5) Detrend low-frequency systematics: if you expect a short period (hours) and see a slow drift (nightly trend), remove it with a low-order polynomial per night, a spline, or a running median filter with a window much longer than the expected period. Choose the window carefully: too short and you will erase the signal.
  • 6) Preserve uncertainties: if you transform flux (e.g., normalize), propagate uncertainties accordingly. Weighted periodograms rely on correct relative weights.
  • 7) Inspect: plot the cleaned light curve and verify that the preprocessing did not introduce periodic artifacts (e.g., edge effects from filtering).
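Steps 2, 3, 4, and 6 above can be sketched as follows. This is a simplified illustration under stated assumptions: outliers are clipped about the global median rather than about a smooth trend, time stamps are assumed unique, and the function name and the threshold `n_mad` are hypothetical. Clipping about a global median would also remove deep eclipses, so it suits only baseline-focused analyses.

```python
import math
import statistics

def clean_and_normalize(t, y, sigma, n_mad=5.0):
    """Drop invalid points, clip outliers with the MAD, normalize by the median."""
    # Step 2: keep only finite measurements with positive uncertainties.
    finite = [(ti, yi, si) for ti, yi, si in zip(t, y, sigma)
              if math.isfinite(yi) and math.isfinite(si) and si > 0]
    values = [yi for _, yi, _ in finite]
    med = statistics.median(values)
    # Step 3: MAD scaled to a Gaussian-equivalent standard deviation.
    robust_sigma = 1.4826 * statistics.median(abs(v - med) for v in values)
    if robust_sigma > 0:
        kept = [r for r in finite if abs(r[1] - med) <= n_mad * robust_sigma]
    else:
        kept = finite
    # Keep a record of what was removed (assumes unique time stamps).
    kept_times = {r[0] for r in kept}
    removed_times = [ti for ti in t if ti not in kept_times]
    # Steps 4 and 6: flux and uncertainty are divided by the same constant.
    baseline = statistics.median(r[1] for r in kept)
    times = [r[0] for r in kept]
    flux = [r[1] / baseline for r in kept]
    err = [r[2] / baseline for r in kept]
    return times, flux, err, removed_times

# Hypothetical data: one NaN and one obvious outlier.
t = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [100.0, 101.0, 99.0, float("nan"), 100.0, 500.0]
sigma = [1.0] * 6
times, flux, err, removed = clean_and_normalize(t, y, sigma)
print("removed times:", removed)
```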

A useful habit is to keep multiple versions: raw, cleaned, detrended, and “analysis-ready.” Period results should be checked against the raw data to ensure the signal is not a preprocessing artifact.

Quantifying variability before searching for periods

Not every light curve is meaningfully variable. Before running periodograms, compute quick variability metrics to decide whether a period search is warranted and to compare objects.

Practical metrics

  • RMS or standard deviation: simple but sensitive to outliers and trends.
  • Robust scatter: MAD scaled by about 1.4826 to match the standard deviation for Gaussian noise; less sensitive to outliers.
  • Peak-to-peak amplitude: informative for eclipses but very sensitive to outliers; use percentiles (e.g., 5th to 95th) instead.
  • Stetson indices (if multi-band or paired observations): detect correlated variability across bands.
  • Autocorrelation: reveals characteristic timescales; periodic signals show repeating peaks.

If the scatter is consistent with the reported uncertainties and no structure is visible, a period search may return spurious peaks dominated by the sampling window.
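A few of these metrics can be computed with the standard library alone. The sketch below is illustrative: the function name and dictionary keys are assumptions, and the reduced chi-square against a constant model is added here as one simple way to compare the scatter with the reported uncertainties.

```python
import statistics

def variability_metrics(y, sigma):
    """Quick variability metrics for a light curve with uncertainties."""
    med = statistics.median(y)
    mad = statistics.median(abs(v - med) for v in y)
    # Cut points at 5%, 10%, ..., 95%; first and last give the 5-95% range.
    cuts = statistics.quantiles(y, n=20, method="inclusive")
    weights = [1.0 / s ** 2 for s in sigma]
    wmean = sum(w * v for w, v in zip(weights, y)) / sum(weights)
    # Reduced chi-square of a constant model; values near 1 mean the scatter
    # is consistent with the quoted uncertainties.
    chi2_red = sum(((v - wmean) / s) ** 2 for v, s in zip(y, sigma)) / (len(y) - 1)
    return {
        "std": statistics.stdev(y),
        "robust_sigma": 1.4826 * mad,
        "amp_5_95": cuts[-1] - cuts[0],
        "chi2_red": chi2_red,
    }

# Hypothetical series: constant flux with a single outlier.
y = [1.0] * 19 + [2.0]
sigma = [0.1] * 20
m = variability_metrics(y, sigma)
print(m)
```

In this example the standard deviation is inflated by the single outlier while the robust scatter and the percentile amplitude stay small, which is the behavior the list above describes.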

Period finding methods

Lomb–Scargle periodogram (LS)

The Lomb–Scargle periodogram is a standard tool for unevenly sampled data. Conceptually, it fits a sinusoid at each trial frequency and measures how much variance is explained. It is well-suited for approximately sinusoidal signals and can incorporate measurement uncertainties via weights.

Important practical points: LS assumes the signal can be approximated by a sinusoid (or a sum of sinusoids if you use a multi-harmonic variant). For eclipse-like signals, LS may detect harmonics strongly. Multi-harmonic LS can improve sensitivity by fitting several harmonics, capturing sharper features.
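To make the idea concrete, here is a minimal pure-Python sketch of a weighted Lomb-Scargle periodogram in its classical form (weighted mean subtracted first, power normalized to lie between 0 and 1). It is meant for illustration on small data sets, not as a replacement for an optimized library routine; the function name and the synthetic data are assumptions.

```python
import math
import random

def lomb_scargle(t, y, sigma, freqs):
    """Weighted classical Lomb-Scargle power at each trial frequency."""
    w = [1.0 / s ** 2 for s in sigma]
    wsum = sum(w)
    w = [wi / wsum for wi in w]
    ybar = sum(wi * yi for wi, yi in zip(w, y))
    yc = [yi - ybar for yi in y]
    yy = sum(wi * v * v for wi, v in zip(w, yc))
    power = []
    for f in freqs:
        omega = 2.0 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal.
        s2 = sum(wi * math.sin(2 * omega * ti) for wi, ti in zip(w, t))
        c2 = sum(wi * math.cos(2 * omega * ti) for wi, ti in zip(w, t))
        tau = math.atan2(s2, c2) / (2 * omega)
        c = [math.cos(omega * (ti - tau)) for ti in t]
        s = [math.sin(omega * (ti - tau)) for ti in t]
        yc_c = sum(wi * v * ci for wi, v, ci in zip(w, yc, c))
        yc_s = sum(wi * v * si for wi, v, si in zip(w, yc, s))
        cc = sum(wi * ci * ci for wi, ci in zip(w, c))
        ss = sum(wi * si * si for wi, si in zip(w, s))
        p = 0.0
        if cc > 1e-12:
            p += yc_c ** 2 / cc
        if ss > 1e-12:
            p += yc_s ** 2 / ss
        power.append(p / yy)
    return power

# Synthetic, irregularly sampled sinusoid with a 2.5-day period.
rng = random.Random(1)
t = sorted(rng.uniform(0.0, 20.0) for _ in range(60))
y = [1.0 + 0.1 * math.sin(2 * math.pi * ti / 2.5) for ti in t]
sigma = [0.01] * len(t)
freqs = [0.05 + 0.005 * k for k in range(391)]  # cycles/day
power = lomb_scargle(t, y, sigma, freqs)
best_f = freqs[power.index(max(power))]
print(f"best period ~ {1.0 / best_f:.3f} d")
```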

Box Least Squares (BLS)

BLS is designed for transit and eclipse signals: repeating box-shaped dips. It searches over trial periods and durations, folding the light curve and evaluating how well a box model fits. BLS is typically more sensitive than LS for narrow dips because it matches the expected shape.

BLS outputs not only a period but also an estimated transit epoch, duration, and depth. It is widely used for exoplanet transit searches and also works well for detached eclipsing binaries with relatively flat out-of-eclipse regions.
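The box-fitting idea can be sketched in a few lines. The version below is a deliberately simplified, unweighted illustration: it scans a coarse grid of box start phases at a single trial duration and uses the usual box-fit statistic s²/(r(1 - r)), where r is the fraction of points in the box and s is their summed mean-subtracted flux divided by N. The function name, grid sizes, and synthetic data are assumptions.

```python
def bls_statistic(t, y, period, duration, n_phase=40):
    """Best box-fit statistic, box start phase, and depth at one trial period."""
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]
    phases = [(ti / period) % 1.0 for ti in t]
    q = duration / period  # fractional width of the box in phase
    best = (0.0, None, None)
    for k in range(n_phase):
        start = k / n_phase
        in_box = [((p - start) % 1.0) < q for p in phases]
        r = sum(in_box) / n
        if r == 0.0 or r == 1.0:
            continue
        s = sum(v for v, inside in zip(resid, in_box) if inside) / n
        sr = s * s / (r * (1.0 - r))
        if s < 0.0 and sr > best[0]:  # keep dips only
            best = (sr, start, -s / (r * (1.0 - r)))
    return best

# Synthetic light curve: 1% deep, 0.1-day dips every 2 days.
t = [0.01 + 0.02 * i for i in range(500)]
y = [0.99 if (ti % 2.0) < 0.1 else 1.0 for ti in t]
trial_periods = [1.0, 1.5, 2.0, 2.5, 4.0]
results = {p: bls_statistic(t, y, p, duration=0.1) for p in trial_periods}
best_period = max(results, key=lambda p: results[p][0])
sr, start_phase, depth = results[best_period]
print(f"best period {best_period} d, depth {depth:.4f}")
```

A real search would use a much finer period grid, several durations, and inverse-variance weights.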

Phase Dispersion Minimization (PDM) and string-length methods

PDM folds the data on a trial period and measures how scattered the phased light curve is within phase bins. The best period minimizes dispersion. PDM does not assume a sinusoidal shape, making it useful for non-sinusoidal periodic variables. String-length methods similarly evaluate how “smooth” the phased curve is by connecting points in phase order; the shortest string indicates the best period.
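A minimal sketch of the PDM statistic follows: the pooled within-bin variance of the folded data divided by the total variance, so values well below 1 indicate a good period. The function name, the fixed number of phase bins, and the synthetic sawtooth signal are illustrative assumptions.

```python
import random

def pdm_theta(t, y, period, n_bins=10):
    """Ratio of pooled within-bin variance to total variance (lower is better)."""
    n = len(y)
    mean = sum(y) / n
    total_var = sum((v - mean) ** 2 for v in y) / (n - 1)
    bins = [[] for _ in range(n_bins)]
    for ti, yi in zip(t, y):
        phase = (ti / period) % 1.0
        bins[min(int(phase * n_bins), n_bins - 1)].append(yi)
    num = 0.0
    dof = 0
    for b in bins:
        if len(b) > 1:  # bins with a single point carry no variance information
            m = sum(b) / len(b)
            num += sum((v - m) ** 2 for v in b)
            dof += len(b) - 1
    return (num / dof) / total_var

# Synthetic sawtooth with a 1.7-day period, irregularly sampled.
rng = random.Random(4)
t = sorted(rng.uniform(0.0, 30.0) for _ in range(200))
y = [(ti / 1.7) % 1.0 for ti in t]
theta_true = pdm_theta(t, y, 1.7)
theta_wrong = pdm_theta(t, y, 1.3)
print(f"theta at 1.7 d: {theta_true:.3f}, at 1.3 d: {theta_wrong:.3f}")
```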

Autocorrelation and time-domain approaches

Autocorrelation can detect periodicity by measuring similarity of the light curve with itself at different lags. It is intuitive and can be robust for quasi-periodic signals, but it can be confused by trends and irregular sampling. Some approaches interpolate to a regular grid first, which can introduce artifacts; use with care.

Step-by-step: a practical period search workflow

This workflow is designed to be repeatable and to produce diagnostics that help you trust (or reject) a candidate period.

1) Define the search range

Choose a minimum and maximum period based on cadence and baseline. The minimum period should be several times the typical sampling interval; the maximum period should be less than (or at most comparable to) the total time span, because you need multiple cycles to confirm periodicity.

  • Minimum period: often ~2–5 times the median cadence.
  • Maximum period: often ~1/2 to 1/3 of the total baseline for robust detection, though longer periods can be suggested with fewer cycles if the signal is strong.
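These rules of thumb translate directly into code. In the sketch below, the function name and the default factors (3 times the median cadence, half the baseline) are assumptions chosen from within the ranges quoted above.

```python
import statistics

def period_search_range(t, cadence_factor=3.0, baseline_fraction=0.5):
    """Return (P_min, P_max) from the median cadence and the total baseline."""
    times = sorted(t)
    cadence = statistics.median(b - a for a, b in zip(times, times[1:]))
    baseline = times[-1] - times[0]
    return cadence_factor * cadence, baseline_fraction * baseline

# Hypothetical nightly sampling over 10 days.
p_min, p_max = period_search_range([float(d) for d in range(11)])
print(f"search periods between {p_min} and {p_max} days")
```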

2) Choose frequency grid and resolution

Periodograms evaluate discrete trial frequencies. If the grid is too coarse, you miss the peak; if too fine, you waste computation and increase the number of trials (affecting false-alarm probabilities). A common approach is to set the frequency step based on the total baseline T, because peak widths scale like ~1/T. Oversampling by a factor (e.g., 5–10) helps locate peaks precisely.

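    # Frequency grid (Python/NumPy sketch).
    # Assumed inputs: t (array of times), P_min, P_max, oversample (e.g., 5-10).
    import numpy as np

    T = np.max(t) - np.min(t)        # total baseline
    f_min = 1.0 / P_max              # lowest trial frequency
    f_max = 1.0 / P_min              # highest trial frequency
    df = 1.0 / (oversample * T)      # step set by the peak width ~1/T
    frequencies = np.arange(f_min, f_max, df)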

3) Compute the periodogram (LS or BLS)

Use LS for near-sinusoidal variability; use BLS for transit/eclipse-like dips. If uncertain, run both and compare. For LS, use a weighted version if uncertainties vary significantly. For BLS, scan a range of durations appropriate to your cadence and expected event length.

4) Identify candidate peaks and check aliases

Take the top few peaks, not just the highest. For each candidate period, check whether it is close to common aliases (1 day, 1/2 day, etc.) or whether it can be expressed as an alias of another peak. If you see a family of peaks separated by 1 cycle/day, the true period may be one of them, and additional data or physical plausibility checks are needed.

5) Fold the light curve and inspect the phased shape

Folding converts time to phase: phase = ((t - t0) / P) mod 1, where P is the trial period and t0 is a reference epoch (often the time of minimum light for eclipses or maximum light for pulsators). Plot phase versus normalized flux, and repeat the curve over phase 0 to 2 so that features near phase 0 or 1 are not split across the plot edge.
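A short sketch of folding, including the repeated second cycle for plotting, is given here. The function name and the sample values are illustrative assumptions.

```python
def fold(t, y, period, t0=0.0):
    """Return phases in [0, 2) and fluxes, sorted by phase and repeated once."""
    phase = [((ti - t0) / period) % 1.0 for ti in t]
    order = sorted(range(len(t)), key=phase.__getitem__)
    ph = [phase[i] for i in order]
    fl = [y[i] for i in order]
    # Append a copy shifted by one cycle so the curve spans phase 0 to 2.
    return ph + [p + 1.0 for p in ph], fl + fl

# Hypothetical observations folded on a 2-day period with t0 = 1.0.
ph, fl = fold([0.0, 2.5, 5.0, 6.0], [1, 2, 3, 4], period=2.0, t0=1.0)
print(list(zip(ph, fl)))
```

Python's `%` operator returns a non-negative result for a positive divisor, so observations taken before `t0` still map to phases between 0 and 1.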

What to look for:

  • Coherence: points line up into a stable curve rather than a cloud.
  • Scatter pattern: increased scatter at certain phases can indicate systematics or real changes (e.g., spot evolution).
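  • Harmonic confusion: if the points inside a single dip split into two branches of different depth, you may be at half the true period (common in eclipsing binaries with similar primary and secondary eclipses); if the folded curve shows two identical dips half a cycle apart, you may instead be at twice the true period.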
  • Phase gaps: if large parts of phase are unobserved, the period may be poorly constrained.

6) Refine the period

Once you have a candidate, refine it by zooming in around the peak with a finer frequency grid, or by fitting a model directly in time domain. Even a simple sinusoid fit can refine the period and provide uncertainties. For eclipse/transit signals, refine using a box model or a more physical shape if available.

7) Quantify significance and uncertainty

Periodogram peaks can occur by chance, especially with many trial frequencies. Use a false-alarm probability (FAP) estimate when available (common for LS). For more robust assessment, use bootstrap or permutation tests: shuffle the flux values relative to times (destroying periodicity but preserving sampling) and recompute the maximum peak; repeat many times to estimate how often noise produces a peak as strong as observed.

Uncertainty in period can be estimated from the width of the peak (related to baseline) and from model fitting. A practical approach is to compute the period that maximizes the statistic and then find the range where the statistic drops by a chosen amount, or to use Monte Carlo resampling with noise consistent with uncertainties.
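The permutation test can be sketched as follows. To keep the example short, a simple normalized Fourier power is used as a stand-in for the periodogram statistic; any statistic, such as LS or BLS power, could be substituted. The function names, the number of trials, and the synthetic data are assumptions.

```python
import math
import random

def max_power(t, y, freqs):
    """Highest normalized Fourier power over the trial frequencies."""
    n = len(y)
    mean = sum(y) / n
    yc = [v - mean for v in y]
    var_sum = sum(v * v for v in yc)
    best = 0.0
    for f in freqs:
        c = sum(v * math.cos(2 * math.pi * f * ti) for v, ti in zip(yc, t))
        s = sum(v * math.sin(2 * math.pi * f * ti) for v, ti in zip(yc, t))
        best = max(best, 2.0 * (c * c + s * s) / (n * var_sum))
    return best

def permutation_fap(t, y, freqs, n_trials=100, seed=0):
    """Fraction of shuffled light curves whose top peak matches the observed one."""
    rng = random.Random(seed)
    observed = max_power(t, y, freqs)
    shuffled = list(y)
    exceed = 0
    for _ in range(n_trials):
        rng.shuffle(shuffled)  # breaks periodicity, keeps the sampling times
        if max_power(t, shuffled, freqs) >= observed:
            exceed += 1
    return (exceed + 1) / (n_trials + 1)

# Synthetic noisy sinusoid at 0.5 cycles/day.
rng = random.Random(2)
t = sorted(rng.uniform(0.0, 10.0) for _ in range(50))
y = [math.sin(2 * math.pi * 0.5 * ti) + rng.gauss(0.0, 0.2) for ti in t]
freqs = [0.05 * k for k in range(1, 41)]
fap = permutation_fap(t, y, freqs)
print(f"estimated FAP <= {fap:.3f}")
```

With `n_trials` shuffles the smallest reportable value is 1/(n_trials + 1), so more trials are needed to claim very low false-alarm probabilities. Plain shuffling also destroys correlated noise, so this estimate assumes white noise.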

Common pitfalls and how to diagnose them

1-day aliases and observing cadence

If you observe nightly, the window function often creates peaks near 1 day and its harmonics. A candidate period of 0.999 days is suspicious unless the phased curve is exceptionally coherent and physically plausible. Compare candidate frequencies f with f_true ± n * 1/day. If multiple peaks fit equally well, additional observations at different times of day (or from different longitudes) can break the degeneracy.
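The alias relation is easy to tabulate for a given candidate. The helper below is a hypothetical illustration that lists the periods corresponding to f ± n cycles/day.

```python
def daily_aliases(period_days, n_max=2):
    """Periods at frequencies f + n cycles/day for n = -n_max..n_max, n != 0."""
    f = 1.0 / period_days
    aliases = {}
    for n in range(-n_max, n_max + 1):
        if n == 0:
            continue
        f_alias = abs(f + n * 1.0)  # a negative frequency is equivalent to |f|
        if f_alias > 0.0:
            aliases[n] = 1.0 / f_alias
    return aliases

# Hypothetical candidate period of 0.8 days.
aliases = daily_aliases(0.8)
for n, p in sorted(aliases.items()):
    print(f"n = {n:+d}: alias period {p:.4f} d")
```

If a second strong periodogram peak sits at one of these periods, the two candidates are likely aliases of each other.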

Harmonics and subharmonics

Non-sinusoidal signals produce harmonics: if the true period is P, LS may show strong power at P/2, P/3, etc. Conversely, sometimes the strongest peak is at 2P if the waveform has alternating depths. Always inspect folded curves at P, 2P, and P/2 for eclipse-like variables.

Trends masquerading as long periods

A slow drift across the observing run can produce a periodogram peak at very long periods. If the best period is comparable to the baseline, check whether detrending changes the result. If removing a low-order trend eliminates the peak, the “period” was likely not astrophysical.

Red noise inflating significance

Many significance formulas assume white noise. If your residuals are correlated, peaks can look more significant than they are. A diagnostic is to examine residual autocorrelation after subtracting a model. If strong correlations remain, consider using time-correlated noise models or conservative bootstrap methods that preserve correlation structure (e.g., block bootstrap).

Worked practical example (conceptual): choosing LS vs BLS

Imagine two light curves with the same sampling: one shows smooth oscillations; the other shows mostly flat flux with occasional narrow dips.

  • Smooth oscillations: Start with LS. Compute the periodogram over a range (say 0.05–10 days). Identify the top peak, fold, and check for a sinusoidal phased curve. If the curve is slightly non-sinusoidal, try multi-harmonic LS to improve the fit and refine the period.
  • Narrow dips: Start with BLS. Choose trial durations from, for example, 0.5% to 10% of the period (bounded by cadence). Compute the BLS spectrum, take the best candidate, fold, and verify that dips align at a consistent phase. Then check 2P to see whether alternating depths suggest an eclipsing binary rather than a transiting planet.

In both cases, after selecting a period, compute residuals (data minus model) and re-run a period search on residuals to look for additional periods (multi-mode pulsators, spot modulation plus orbital effects). This iterative approach is common in real analyses.

Beyond a single period: multi-periodicity and evolving signals

Prewhitening (iterative subtraction)

If a light curve contains multiple sinusoidal components, you can identify the strongest period with LS, fit a sinusoid, subtract it (prewhiten), and then recompute the periodogram on residuals. Repeat to find additional frequencies. This works best when components are stable and approximately sinusoidal.
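One prewhitening step can be sketched as a linear least-squares fit of a cosine and sine term at a known frequency, followed by subtraction. The function name and the synthetic two-component signal are assumptions; in practice the frequency would come from the periodogram peak.

```python
import math
import random

def fit_and_subtract(t, y, freq):
    """Fit a*cos + b*sin at `freq` to the mean-subtracted data; return amplitude and residuals."""
    n = len(y)
    mean = sum(y) / n
    yc = [v - mean for v in y]
    c = [math.cos(2 * math.pi * freq * ti) for ti in t]
    s = [math.sin(2 * math.pi * freq * ti) for ti in t]
    cc = sum(ci * ci for ci in c)
    ss = sum(si * si for si in s)
    cs = sum(ci * si for ci, si in zip(c, s))
    yc_c = sum(v * ci for v, ci in zip(yc, c))
    yc_s = sum(v * si for v, si in zip(yc, s))
    # Solve the 2x2 normal equations for the coefficients a and b.
    det = cc * ss - cs * cs
    a = (yc_c * ss - yc_s * cs) / det
    b = (yc_s * cc - yc_c * cs) / det
    residuals = [v - a * ci - b * si for v, ci, si in zip(yc, c, s)]
    return math.hypot(a, b), residuals

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

# Synthetic signal: strong component at 0.3 c/d, weak one at 0.8 c/d.
rng = random.Random(3)
t = sorted(rng.uniform(0.0, 40.0) for _ in range(100))
y = [0.5 * math.sin(2 * math.pi * 0.3 * ti) + 0.1 * math.sin(2 * math.pi * 0.8 * ti)
     for ti in t]
amplitude, residuals = fit_and_subtract(t, y, 0.3)
print(f"fitted amplitude {amplitude:.3f}, residual rms {rms(residuals):.3f}")
```

The residuals can then be passed back to the periodogram to search for the weaker component.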

Time-dependent periods and amplitude changes

Some signals drift in period or phase. A single global period may blur the folded curve. Diagnostics include:

  • O–C diagrams: measure times of maxima/minima per cycle and compare to a constant-period prediction; systematic deviations indicate period changes.
  • Sliding-window periodograms: compute periodograms in overlapping time windows to see how dominant frequencies evolve.
  • Wavelet-like approaches: represent power as a function of time and frequency, useful for quasi-periodic behavior.

These methods treat the light curve as a non-stationary signal, where the “rules” can change over time.
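As an illustration of the first diagnostic, an O-C table can be built from measured times of minimum (or maximum) and a reference ephemeris. The function name and the example timings are hypothetical.

```python
def o_minus_c(observed_times, t0, period):
    """Return (cycle number, observed minus calculated time) for each event."""
    rows = []
    for t_obs in observed_times:
        cycle = round((t_obs - t0) / period)   # nearest integer cycle count
        predicted = t0 + cycle * period        # constant-period prediction
        rows.append((cycle, t_obs - predicted))
    return rows

# Hypothetical minima that arrive 0.01 d later each cycle than a 2.0-day ephemeris predicts.
rows = o_minus_c([0.0, 2.01, 4.02, 6.03], t0=0.0, period=2.0)
for cycle, oc in rows:
    print(f"cycle {cycle}: O-C = {oc:+.3f} d")
```

A linear trend in O-C, as in this example, indicates that the assumed period is slightly off; curvature would suggest a period that is changing.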

Practical checklist for trustworthy period results

  • Data sanity: no obvious artifacts, correct time stamps, and uncertainties make sense.
  • Preprocessing transparency: detrending choices documented and tested for sensitivity.
  • Window awareness: candidate periods checked against sampling aliases.
  • Phase coherence: folded light curve shows a stable pattern with reasonable scatter.
  • Model residuals: residuals do not show the same period strongly; if they do, the model is incomplete.
  • Alternative methods: LS vs PDM vs BLS comparison for consistency with expected waveform.
  • Significance: FAP or bootstrap assessment, especially when peaks are modest.

Now answer the exercise about the content:

A light curve shows mostly flat brightness with repeating narrow dips. Which period-finding approach is most appropriate and why?

BLS matches the expected waveform for transit or eclipse signals by fitting repeating box-shaped dips, making it more sensitive than sinusoid-based methods when events are narrow.
