Friday, December 17, 2021

Relative solar panel power output in Flux

The problem: I want to display solar generation data as a percentage of the max output each individual inverter has generated over its lifetime.  That will let me more easily compare panels of widely varying power outputs, as we have some huge arrays (2 megawatts) and some tiny ones (1000-2000 watts).

In an SQL like query language what I'm looking for is something like this:

SELECT uuid, _time, W / MAX(W) ...

but how do you do that with flux?  The challenge gets back to being able to think about queries in terms of streams that you later join together.  I solved the problem this way:

// For each inverter, find the maximum power this year
// The result is a series of maximums (one per tag set, which
// in this case resolves to individual inverters)
wmax = from(bucket: "vpp")
  |> range(start2021-01-01T00:00:00Z, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "data" and r["_field"] == "W")
  |> max()

// Now pull out just the power output - this is the time series data
wdata = from(bucket: "vpp")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "data" and r["_field"] == "W")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)

 // Finally, Join these together on uuid, the result will contains columns
// including _time_wdata, _value_wmax and _value_wdata. Follow the join
// with a map() that actually calculates the percentage (and as as side
// effect filters out extra tags/columns so that it displays nicely in
// grafana)
join(tables: {wdata: wdata, wmax: wmax}, on: ["uuid"] )
|> map( fn: (r) => ({
uuid: r.uuid,
_time: r._time_wdata,
_value: r._value_wdata/r._value_wmax
|> yield()

The result gives you a set of relative power curves normalized to each system's max this year

For performance reasons, it's best to avoid any kind of map() in the wmax or wdata queries.  The goal for things like this is to use the pushdown pattern, meaning use functions that let the query be performed (and the result set reduced in size) at the storage layer, rather than in memory.  This blog post does a good job of explaining this, and the pushdown patterns are explained Here.  Note that map() can't be pushed down, so in general I think the best strategy is to avoid map() until the very end - after the results have been filtered and stats like min(), max(), mean() have already run on the storage side.

Flux is a different way of thinking.  If you have spent decades thinking in SQL (like me), it takes a little time to get used to this.  Flux tends to be wordier, but what you get out of it is something that's much more flexible, so I think it's worth it.

I hope these flux notes help others.  I've grown to like flux a lot, but I found the learning curve to be a little steep, and whenever I'm faced with a new type of problem I have to really think about the data model and what is coming out of each step in the stream I set up.  I tend to bug the InfluxData staff, but fortunately they have been very patient and helpful.

No comments:

Post a Comment