Why is the p99 latency showing 9.0ks when all latencies are much lower?
The p99 latency showing 9.0ks when all latencies are much lower could be due to an issue in the calculation or data aggregation process. According to this GitHub issue, there have been instances where the reported statistics did not align with the expected values, potentially due to data anomalies or calculation errors.
Would you like more detail on how to troubleshoot or verify the latency calculations?
Sources:
- this GitHub issue
💡 Hint: Mention RunLLM in the thread for followups.
could be due to interpolation given the small sample size. which db are you using?
the p50 is also wrong
sqlite
it does give results different than numpy, but not what you showed in your screenshot
if you want to try it
%pip install -qqq sqlean.py

import numpy as np
import sqlean

ts = [0.59, 0.59, 0.59, 0.59, 0.94, 1.79, 73.2]

# np.percentile takes q in percent (0-100), so p50 and p99 are 50 and 99
for method in ("inverted_cdf", "averaged_inverted_cdf", "interpolated_inverted_cdf", "hazen", "weibull", "linear", "median_unbiased", "normal_unbiased"):
    print(np.percentile(ts, [50, 99], method=method), method)

# same data through sqlite's median() and percentile() via the sqlean extensions
sqlean.extensions.enable_all()
conn = sqlean.connect(":memory:")
conn.execute("create table latency(t)")
conn.execute("insert into latency values" + ",".join(f"({t})" for t in ts))
cur = conn.execute("select median(t), percentile(t,99) from latency")
print(cur.fetchone())
conn.close()
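to make the interpolation point concrete, here is a minimal pure-Python sketch of two common percentile definitions on the same seven samples. the helper names `percentile_linear` and `percentile_nearest_rank` are made up for illustration; they are meant to mirror the usual "linear" and "inverted CDF" definitions, not the code of any particular database or library.

```python
import math

def percentile_linear(values, q):
    """q in percent (0-100); interpolate linearly between the two closest ranks."""
    xs = sorted(values)
    rank = (len(xs) - 1) * q / 100.0   # fractional 0-based position
    lo = math.floor(rank)
    hi = min(lo + 1, len(xs) - 1)
    frac = rank - lo
    return xs[lo] + frac * (xs[hi] - xs[lo])

def percentile_nearest_rank(values, q):
    """q in percent (0-100); smallest sample with at least q% of the data at or below it."""
    xs = sorted(values)
    k = max(math.ceil(len(xs) * q / 100.0), 1)   # 1-based rank
    return xs[k - 1]

ts = [0.59, 0.59, 0.59, 0.59, 0.94, 1.79, 73.2]
for q in (50, 99):
    print(q, percentile_linear(ts, q), percentile_nearest_rank(ts, q))
```

with only seven samples the two definitions disagree noticeably at p99 (roughly 68.9 interpolated vs 73.2 nearest-rank), but both stay at or below the largest sample, so the choice of interpolation on its own would not produce a value in the thousands.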