Ask Astra

How is my system uptime this week?

Real availability, weighted by what actually mattered.

Business insight · Product · Weekly Monday + immediate alert on any 5+ min critical-service outage.

Free to start · No credit card required · Updated Apr 2026

You'd think this needs a status page review and three Datadog dashboards — Astra has the weekly health summary plus the one incident that actually cost you money.

The short answer

Astra answers "how is my system uptime this week" by pulling availability metrics from Vercel, Cloud Run, Datadog synthetic checks, and Sentry error rates, then computing real uptime per critical service over the trailing 7 days. She doesn't just report a single nine-count — she breaks it down by surface (marketing site, checkout, API, dashboard) and weights each incident by what it touched. A 12-minute Cloud Run blip during European business hours that hit your signup endpoint matters more than a 2am marketing-site hiccup. The output is a Lark message: per-service uptime percentage, total downtime in minutes, the worst incident with root cause from Sentry, and an estimate of users affected pulled from PostHog session counts during the window. You stop chasing alert noise and focus on the one outage that actually mattered.
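As a rough illustration of the arithmetic behind a per-service uptime percentage, here is a minimal sketch that assumes a purely time-based definition (minutes down divided by minutes in the trailing 7-day window). The function name `uptime_percent` is hypothetical, and Astra's actual calculation may differ, for example if it is based on request success rate instead of wall-clock time.

```python
WEEK_MINUTES = 7 * 24 * 60  # trailing 7-day window = 10,080 minutes


def uptime_percent(downtime_minutes: float, window_minutes: float = WEEK_MINUTES) -> float:
    """Time-based availability: the share of the window the service was up."""
    if not 0 <= downtime_minutes <= window_minutes:
        raise ValueError("downtime must be between 0 and the window length")
    return 100.0 * (1 - downtime_minutes / window_minutes)


# Two minutes of downtime in a week rounds to 99.98%.
print(f"{uptime_percent(2):.2f}%")
```

Under this definition each extra "nine" shrinks the weekly downtime budget tenfold: 99.9% allows roughly 10 minutes per week, and 99.99% allows roughly 1 minute.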

How Astra actually does it

1. Pull availability per service
   Vercel deployment health, Cloud Run request success rate, Datadog synthetic uptime checks, and Sentry error rate spikes, all over the last 7 days.
   Vercel · Datadog · Sentry
2. Weight by surface criticality
   Checkout > API > dashboard > marketing. Same downtime gets a different severity depending on what broke.
3. Cross-reference with PostHog traffic
   Match incident windows to PostHog session counts to estimate users actually affected, not theoretical worst case.
   PostHog
4. Identify the worst incident
   Pull Sentry stack trace and Vercel deploy log around the incident window, surface root cause and which deploy introduced it.
5. Send the weekly digest
   Lark message: per-service uptime, total downtime, worst incident with root cause, users affected, fix status.
   Lark
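
Steps 2 to 4 can be sketched roughly as follows. This is an illustration under stated assumptions, not Astra's implementation: the numeric weights in `CRITICALITY`, the `Incident` type, the scoring formula (weight times downtime minutes), and the sample timestamps are all hypothetical. Only the ordering checkout > API > dashboard > marketing comes from the steps above.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical weights that mirror the ordering checkout > API > dashboard > marketing.
CRITICALITY = {"checkout": 4, "api": 3, "dashboard": 2, "marketing": 1}


@dataclass
class Incident:
    surface: str
    start: datetime
    end: datetime

    @property
    def minutes(self) -> float:
        return (self.end - self.start).total_seconds() / 60


def severity(incident: Incident) -> float:
    """Downtime minutes scaled by how critical the affected surface is."""
    return CRITICALITY[incident.surface] * incident.minutes


def sessions_affected(incident: Incident, session_times: list[datetime]) -> int:
    """Count sessions whose timestamp falls inside the incident window."""
    return sum(incident.start <= t < incident.end for t in session_times)


def worst_incident(incidents: list[Incident]) -> Incident:
    """Pick the incident with the highest weighted severity."""
    return max(incidents, key=severity)


# Illustrative data: two 12-minute incidents on different surfaces.
incidents = [
    Incident("marketing", datetime(2026, 4, 7, 2, 0), datetime(2026, 4, 7, 2, 12)),
    Incident("api", datetime(2026, 4, 8, 14, 32), datetime(2026, 4, 8, 14, 44)),
]
worst = worst_incident(incidents)
print(worst.surface, severity(worst))  # api 36.0
```

Both incidents last 12 minutes, but the API incident scores three times higher than the marketing one, which is the "same downtime, different severity" idea from step 2.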

What it looks like in chat

How is system uptime this week?

Astra
Trailing 7 days:

- **Marketing site (Vercel):** 99.98% — 2 min cold-start blip Tuesday 3am. No users affected.
- **API (Cloud Run):** 99.81% — 16 min total downtime. Two incidents.
- **Dashboard:** 99.94% — 5 min Sentry error spike Thursday 9pm.
- **Checkout:** 100%.

**Worst incident:** Wednesday 14:32-14:48 UTC. Cloud Run API returned 503 on `/v1/run` for ~16 min. Root cause: deploy `a3f9c2` removed an env var without a fallback. PostHog shows 84 active sessions in that window, ~23 failed requests. Auto-rolled-back at 14:48, fix deployed 15:02.

**Action:** Add a startup check for required env vars to your Dockerfile (~30 min). I can file the task and assign to your developer agent. Otherwise this exact failure repeats next time someone removes a config key.
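
A startup check like the one recommended in this sample reply might look like the sketch below, assuming a Python service whose entrypoint is the command the container runs. The variable names in `REQUIRED_ENV_VARS` and both function names are hypothetical placeholders.

```python
import os
import sys

# Hypothetical names; list whatever your service cannot run without.
REQUIRED_ENV_VARS = ["DATABASE_URL", "API_KEY"]


def missing_env_vars(required, env=None):
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def check_required_env(required=REQUIRED_ENV_VARS, env=None) -> None:
    """Exit with a clear message before the service starts taking traffic."""
    missing = missing_env_vars(required, env)
    if missing:
        sys.exit(f"Missing required environment variables: {', '.join(missing)}")


# Call check_required_env() as the first step of the app's entrypoint,
# so a deploy with a missing key fails at startup instead of returning 503s.
```

Failing at startup means the broken revision never becomes healthy, so the platform can keep serving the previous one.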

What you get back

Lark digest every Monday with per-service uptime, total downtime in minutes, the single worst incident with root cause and users affected, plus a concrete fix recommendation.

Cadence

Weekly Monday + immediate alert on any 5+ min critical-service outage.

Ask Astra this right now

We'll spin up your workspace, hand the prompt to Astra, and you'll see the answer in 60 seconds. Free.

FAQ

What if I only have Vercel, no Datadog or Sentry?

Astra works with what you have. Vercel alone gives deployment health and edge availability. Adding Sentry sharpens root cause detection; Datadog adds synthetic uptime from outside your infra. Most users start with Vercel + Sentry and add Datadog later.

Does she alert me in real time during an outage?

Yes. Any critical-surface outage (checkout, API) lasting 5 minutes or longer triggers an immediate Lark ping with the Sentry link and the suspected deploy. The Monday digest is the retrospective view.
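
The alert rule can be pictured as a simple predicate. This is a minimal sketch, assuming the threshold is inclusive (matching the "5+ min" cadence wording) and that surfaces are identified by plain strings. The names `CRITICAL_SURFACES`, `ALERT_THRESHOLD`, and `should_alert` are hypothetical.

```python
from datetime import timedelta

# Assumed defaults: the two surfaces named as critical for real-time alerts.
CRITICAL_SURFACES = {"checkout", "api"}
ALERT_THRESHOLD = timedelta(minutes=5)


def should_alert(surface: str, outage_duration: timedelta) -> bool:
    """True when a critical surface has been down for the threshold or longer."""
    return surface in CRITICAL_SURFACES and outage_duration >= ALERT_THRESHOLD


print(should_alert("api", timedelta(minutes=6)))         # True
print(should_alert("marketing", timedelta(minutes=30)))  # False
```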

Can she actually roll back a bad deploy?

She drafts the rollback and asks for one-tap confirmation. On approval she runs `vercel rollback` or the equivalent Cloud Run revision swap, then posts confirmation. You stay in the loop without typing the command.

What counts as a 'critical' surface?

By default: anything in your checkout flow, your authenticated API, and your signup page. You can override — say "the docs site is critical to me" and she'll bump it to top priority for alerts and weighting.

Run your one-person company.

Hire your AI team in 30 seconds. Start for free.
