Wednesday, February 7, 2024

IPv6 Only Ubuntu instance on Amazon Web Services

Pay Per IPv4 Address?


You're probably aware of AWS's plan to charge for public IPv4 addresses.  The cost for an address is $0.005 per hour, which works out to about $3.65/month.  That's not going to break the bank, but if you run a few tiny instances (t3.micro or even t3.nano) you might pay more for the IP address than for the virtual machine it's attached to.


Secondarily, I wondered whether I could create an EC2 instance that has only IPv6, and whether there would be problems.  I decided to try.

Configuring IPv6 on EC2

Setting up IPv6 on EC2 takes a little work, most of it being done in the VPC.  Amazon has documentation that explains it.  The gist is:

  1. Associate a public IPv6 address block with your VPC.  To do this, open the VPC console, select the VPC your instances are on, and edit the CIDR blocks.  Add an "Amazon-provided IPv6 CIDR block" associated with your EC2 region or AZ (the network border group).
  2. Create at least one subnet within that block. Typically this will be a subset of the block (for example a /64).  Your instances will choose an address from within this subnet.
  3. Make sure your EC2 instance belongs to this VPC/subnet (if it doesn't already), and have the subnet auto-assign it an IPv6 address.
  4. Update the routing tables to include an IPv6 default route (::/0).

My instructions here are very incomplete, but the Amazon instructions are good (though long), so follow those.
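
For reference, here's a rough sketch of the same steps using the AWS CLI.  The resource IDs below are placeholders, and the /64 is just an example - the actual prefix comes from the block Amazon assigns you:

# 1. Ask Amazon to assign an IPv6 block to the VPC (IDs are placeholders)
aws ec2 associate-vpc-cidr-block --vpc-id vpc-0123456789abcdef0 \
    --amazon-provided-ipv6-cidr-block

# 2. Carve a /64 out of that block for one subnet
aws ec2 associate-subnet-cidr-block --subnet-id subnet-0123456789abcdef0 \
    --ipv6-cidr-block 2600:1f18:1234:5600::/64

# 3. Have new instances in the subnet auto-assign an IPv6 address
aws ec2 modify-subnet-attribute --subnet-id subnet-0123456789abcdef0 \
    --assign-ipv6-address-on-creation

# 4. Add an IPv6 default route through the internet gateway
aws ec2 create-route --route-table-id rtb-0123456789abcdef0 \
    --destination-ipv6-cidr-block ::/0 --gateway-id igw-0123456789abcdef0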

APT Updates


The first pain point I ran into was apt breaking.  There are two reasons for this.  The first is that you have to specifically configure apt to use IPv6, which you can do by adding a file:

root# cat /etc/apt/apt.conf.d/1000-force-ipv6-transport
Acquire::ForceIPv6 "true";

Second, if you built your Ubuntu instance from one of the EC2 templates, it's going to have apt repositories configured that don't support IPv6.  In my case this was us-east-1.ec2.ports.ubuntu.com.  It surprises me that a repository that's obviously hosted within AWS wouldn't have an IPv6 address.

To fix this, I changed /etc/apt/sources.list to point toward a more generic Ubuntu package source:

deb http://ports.ubuntu.com/ubuntu-ports/ jammy main restricted
deb http://ports.ubuntu.com/ubuntu-ports/ jammy-updates main restricted
deb http://ports.ubuntu.com/ubuntu-ports/ jammy universe
deb http://ports.ubuntu.com/ubuntu-ports/ jammy-updates universe

This probably has some speed implications, and most likely some cost implications as well, since the package repository is no longer in the same AWS region as your instance.  But that will likely cost you pennies at most, and speed isn't that important for routine patching and updates.

Is it connectable?


All the testing I have done so far has been fine.  I work at a place that has good IPv6 infrastructure, as does my cable internet service (Spectrum).  Mobile devices seem to be ahead of the curve with respect to IPv6.  So, for my use case, IPv6 only seems fine.  I do have concerns, though, about connectivity - how many IPv4-only clients are out there?  No idea.  I don't think I would risk it for a truly "production" application until I understand the answer to that question.
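
If you want to spot-check reachability yourself, a few generic client-side checks go a long way (the hostname here is a placeholder for your instance's DNS name):

# Does the name resolve to an IPv6 address?
dig AAAA ipv6-test.example.com +short

# Can we actually reach it over IPv6?  (-6 forces IPv6)
ping -6 -c 3 ipv6-test.example.com
curl -6 -v https://ipv6-test.example.com/
ssh -6 ubuntu@ipv6-test.example.com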

Docker


Docker containers running within the EC2 instance are a little more problematic.  If those containers need to reach out to the internet for any reason, they won't work unless you specifically enable IPv6 for them (or set up some kind of proxy).  That's an issue I was able to solve, and I will explain how another day.
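
I'll save the details for that future post, but the core of it is enabling IPv6 in the Docker daemon configuration.  A minimal sketch of /etc/docker/daemon.json might look like the following - the ULA prefix is an arbitrary example, and depending on your Docker version this alone may not be enough for outbound traffic:

# /etc/docker/daemon.json (sketch only; pick your own IPv6 prefix)
{
  "ipv6": true,
  "fixed-cidr-v6": "fd00:dead:beef::/64"
}

Restart the Docker daemon after changing it (sudo systemctl restart docker).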




Friday, December 17, 2021

Relative solar panel power output in Flux

The problem: I want to display solar generation data as a percentage of the max output each individual inverter has generated over its lifetime.  That will let me more easily compare panels of widely varying power outputs, as we have some huge arrays (2 megawatts) and some tiny ones (1000-2000 watts).

In an SQL-like query language, what I'm looking for is something like this:

SELECT uuid, _time, W / MAX(W) ...

but how do you do that with flux?  The challenge gets back to being able to think about queries in terms of streams that you later join together.  I solved the problem this way:

//
// For each inverter, find the maximum power this year
// The result is a series of maximums (one per tag set, which
// in this case resolves to individual inverters)
//
wmax = from(bucket: "vpp")
  |> range(start: 2021-01-01T00:00:00Z, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "data" and r["_field"] == "W")
  |> max()

//
// Now pull out just the power output - this is the time series data
//
wdata = from(bucket: "vpp")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "data" and r["_field"] == "W")
  |> aggregateWindow(every: v.windowPeriod, fn: mean, createEmpty: false)

//
// Finally, join these together on uuid. The result will contain columns
// including _time_wdata, _value_wmax and _value_wdata. Follow the join
// with a map() that actually calculates the percentage (and as a side
// effect filters out extra tags/columns so that it displays nicely in
// grafana)
//
join(tables: {wdata: wdata, wmax: wmax}, on: ["uuid"])
  |> map(fn: (r) => ({
      uuid: r.uuid,
      _time: r._time_wdata,
      _value: r._value_wdata / r._value_wmax
    }))
  |> yield()

The result gives you a set of relative power curves, normalized to each system's max this year.



For performance reasons, it's best to avoid any kind of map() in the wmax or wdata queries.  The goal for things like this is to use the pushdown pattern, meaning use functions that let the query be performed (and the result set reduced in size) at the storage layer rather than in memory.  This blog post does a good job of explaining this, and the pushdown patterns are documented here.  Note that map() can't be pushed down, so in general I think the best strategy is to avoid map() until the very end - after the results have been filtered and stats like min(), max(), and mean() have already run on the storage side.
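
As a hypothetical illustration using the same bucket and field as above, the first chain below is built entirely from pushdown-capable functions, while inserting a map() in the middle forces the rest of the work into memory:

// Pushed down: range, filter, and the bare max() all run at the storage layer
from(bucket: "vpp")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "data" and r["_field"] == "W")
  |> max()

// Not pushed down: map() breaks the chain, so the filtered data is pulled
// into memory and the map() and max() run there
from(bucket: "vpp")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "data" and r["_field"] == "W")
  |> map(fn: (r) => ({r with _value: r._value / 1000.0}))
  |> max()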

Flux is a different way of thinking.  If you have spent decades thinking in SQL (like me), it takes a little time to get used to this.  Flux tends to be wordier, but what you get out of it is something that's much more flexible, so I think it's worth it.

I hope these flux notes help others.  I've grown to like flux a lot, but I found the learning curve to be a little steep, and whenever I'm faced with a new type of problem I have to really think about the data model and what is coming out of each step in the stream I set up.  I tend to bug the InfluxData staff, but fortunately they have been very patient and helpful.








Wednesday, December 8, 2021

Closing the Initial Gap on Grafana State Timelines (Flux queries)

I have an application that feeds periodic state into an influxdb 2 table.  This data can be queried rather simply:

from(bucket: "vpp")
  |> range(start: vTimeRangeStart, stop: v.timeRangeStart)
  |> filter(fn: (r) => r.uuid == "${uuid}" and r._field == "State")|
  |> last()
  |> keep(columns: ["_time", "_value"])


This sort of works, but it causes an unsightly gap at the start of the graph.  A state exists there, but it was recorded before v.timeRangeStart, so the query doesn't see it.



My flux solution to this problem is to fetch an initial value from up to 24 hours prior to v.timeRangeStart.  That proved to be a little tricky, but Flux does provide an experimental way to manipulate times:

import "experimental"

initialStateTimeRangeStart = experimental.subDuration(
    d: 24h,
    from: v.timeRangeStart
)

Using this you can then get an initial state:

initial = from(bucket: "vpp")
  |> range(start: initialStateTimeRangeStart, stop: v.timeRangeStart)
  |> filter(fn: (r) => r.uuid == "${uuid}" and r._field == "State")
  |> last()
  |> keep(columns: ["_time", "_value"])


Here is the complete query, which fetches the initial state and appends it to the recent data:

import "experimental"

initialStateTimeRangeStart = experimental.subDuration(
  d: 24h,
  from: v.timeRangeStart
)

initial = from(bucket: "vpp")
  |> range(start: initialStateTimeRangeStart, stop: v.timeRangeStart)
  |> filter(fn: (r) => r.uuid == "${uuid}" and r._field == "State")
  |> last()
  |> keep(columns: ["_time", "_value"])

recent = from(bucket: "vpp")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r.uuid == "${uuid}" and r._field == "State")
  |> keep(columns: ["_time", "_value"])

union(tables: [initial, recent])
  |> sort(columns: ["_time"])
  |> yield()


with a pleasant result that extends the previous state onto the graph:



A few things to watch out for:

  • It appears to me that v.timeRangeStart is a 'time' in grafana and a 'duration' in chronograf (or influx 2).  For most queries you won't notice this, as range() accepts either, but in this case, in chronograf you have to convert the duration to a time first, so the time calculation is a bit different (see below)
  • The documentation for union warns you that the sort order is not necessarily maintained, and they are right.  That sort() at the end is required.
  • Multiple yields let you see what's happening along the way, so feel free to add |> yield(name: "initial") to initial and |> yield(name: "recent") to recent if it helps with debugging (see the short sketch after this list).
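
For example, tacking these two lines onto the query above (the names are arbitrary) shows each intermediate table as its own result set:

initial |> yield(name: "initial")
recent |> yield(name: "recent")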

As I mentioned, in the chronograf or influx data explorer you need to modify the time offset calculation:

initialTimeRangeStart = experimental.subDuration(
   d: 24h,
   from: experimental.addDuration(
      d: duration(v: v.timeRangeStart),
      to: now()
   )
)

I don't find this ideal, but it's still useful to know how to do it.  I find it easier to develop queries in the influxdb data explorer, as it seems to give more meaningful error messages when things go haywire, and it's a little better at presenting the raw query results.

Note: I'm not sure that anybody reads any of this, but I know the search engines index it all, so perhaps somebody will find it useful some day.  If you're that person, feel free to comment here so I know someone is listening.  Thanks in advance!

Thursday, January 30, 2020

Influx 2.0 queries with Grafana


Influx 2.0 (flux) queries in grafana are a little on the messy side.  Here are a couple of examples that might be useful.


Here is an example of summing up a number of fields for display on a "single stat" panel widget:

from(bucket: "powermon")
|> range(start: -1h)
|> filter(fn: (r) => r._measurement == "modbus")
|> filter(fn: (r) => r._field == "pv1" or r._field == "pv2" or r._field == "pv3" or r._field == "pv4" or r._field == "pv5" or r._field == "pv6" or r._field == "pv7")
|> last()
|> keep(columns: ["_value"])
|> sum()


When playing with graphs, things are more or less as you'd expect:

from(bucket: "powermon")
|> range(${range})
|> filter(fn: (r) => r._measurement == "modbus")
|> filter(fn: (r) => r._field == "pv1" or r._field == "pv2" or r._field == "pv3" or r._field == "pv4" or r._field == "pv5" or r._field == "pv6" or r._field == "pv7")
|> aggregateWindow(every: 1m, fn: (tables=<-, column) => {
     return tables
       |> mean(column: column)
       |> map(fn: (r) => ({r with _value: r._value * 1000.0}))
   })
|> yield(name: "mean")
 
 



Wednesday, July 12, 2017

Table-driven holes in Solidworks

I recently had occasion to create a series of holes in Solidworks.  The hole definition isn't complicated, but it has a few parts to it - the specific shape I am going for cups an LED and couples it to an acrylic light pipe.  This is embedded into a large plastic plate, which could be a base plate or perhaps a cover.  The hole shape, in section view, looks like this:




The hole shape is made by drawing a 2D profile and then revolving it around the centerline, which in the view below is on the left side.  The 2D shape, in profile, makes up one half of the hole.


I needed nine such holes but not on a regular pattern.  It turns out there is a way to duplicate features in SolidWorks but using it is a little confusing - the manual page and tutorial are pretty short.  I'm documenting the process here so I can give some tips (and remind myself how to do this a year from now when I probably will have forgotten the whole thing).

The first thing you need to do is to create a coordinate system.  This was, for me, one of the confusing parts.  The coordinate system defines an orientation (which direction is X, Y, and Z, including which way is positive and negative) and an origin (what point corresponds to coordinates 0,0,0).  There are a few ways to define a coordinate system.  For this, I'm setting an origin, Y, and Z.  From those the software figures out which way is X.

To begin, I've drawn the 'base' into which all these holes will be extruded:


I started the hole by using the Hole Wizard, but the Hole Wizard isn't magical.  All it does is create two 2D sketches and then revolve the second one around a centerline:


The first sketch does nothing but define the center point of the hole.  The second sketch is the 2D hole shape listed above.

Next, I created a reference point that marks the center of the hole - essentially this brings the center point of the hole out of the sketch and up to the 3D model.



Now I have my base with one hole in it:


The next step is to define a coordinate system.  In my case, the origin is the center of the hole.  This means that in my table, the first hole will be at location 0,0 and all the "copies" will be relative to that:



Next, I create the table.  I did this in Excel.  By default the coordinates use the same units as your document, but you can specify 'in' or 'mm' (or any other SolidWorks unit) if you like:


Finally, create the pattern:



A couple of things to note:

  • The coordinate system is placed at the point we created, so that's 0,0 - all the coordinates in the table will be relative to that
  • The "Feature" is the hole, created by the hole wizard (then its shape modified by me)


The result is my baseplate with all of the holes replicated at my X,Y locations:


Now don't ask me what it is (it's a secret) - I'll tell you next month!


Monday, May 15, 2017

Which is better?

I am making a Spring Boot app using Spring Data that has, as part of its function, an embedded wiki. Individual wiki pages have workflows associated with them. Rather than maintain a list of parent/child relationships for pages, I'm using an implied hierarchical structure based on a 'path' column that serves as the coordinates for each page. The hard part is finding only the direct descendants of each page. To do this, I look for paths that include the 'parent' but have no other path separators. In other words, descendants of /labs would include /labs/electronics and /labs/bio but would not include /labs/electronics/equipment because, while equipment is a descendant of /labs, it is not a direct descendant.

To accomplish this, I do two things:
  • Find the paths that begin with the parent's path, including trailing slash, for example /labs/. This selects all descendants of the parent.
  • Of those, find the ones who have NO OTHER SLASHES beyond the parent. This selects only the direct descendants.
Here are two ways to do it, both produce identical results:

Using JPQL
@Query("select w from WikiPage w " +
    " where w.path like concat(:path, '/', '%') " +
    " AND LOCATE('/', w.path, length(:path)+2) = 0")
public List<WikiPage>
    findDirectChildrenOfUsingAnnotationQuery(@Param("path") String path);

Using the Java-based Query DSL
@Override
public List<WikiPage> findDirectChildrenOfUsingQueryDSL(String searchPath) {
    CriteriaBuilder builder = em.getCriteriaBuilder();
    CriteriaQuery<WikiPage> query = builder.createQuery(WikiPage.class);
    Root<WikiPage> wikiPage_root = query.from(WikiPage.class);

    Expression<String> wikiPage_path = wikiPage_root.get("path");
    Predicate startsWithPath = builder.like(wikiPage_path,  searchPath + "/%");

    Expression<Integer> slashLocation = builder.locate(wikiPage_path, "/",
        searchPath.length()+2);
    Predicate noMoreDescendants = builder.equal(slashLocation, 0);

    query = query.where(builder.and(startsWithPath, noMoreDescendants));
    return em.createQuery(query.select(wikiPage_root)).getResultList();
}

Reasons to use the Java DSL
  • It is more refactorable ... but a good IDE like IntelliJ can refactor JPQL just fine
  • It's more typesafe ... but not completely so, and again, IntelliJ does a pretty good job. You could even argue that IntelliJ does better, because the Java DSL can't tell that "path" is a valid attribute of WikiPage, whereas in JPQL IntelliJ can check that
  • Compile-time errors if you screw up ... but again, IntelliJ already tells me if my JPQL is invalid
  • Certain operations - like the concat and strlen - are arguably better handled in Java than by the db engine

The Java DSL is definitely harder to read. But I have a distaste for 'magical strings' which includes JPQL in a @Query annotation.

Thoughts?