The query:
SELECT int(year/10)*10 AS decade, count(*) AS incidents, sum(incidents.dead) AS dead
FROM incidents
WHERE ([cause]='MI' or [cause]='MI?')
GROUP BY (int (year/10)*10);
This produces three columns:
decade | incidents | dead |
---|---|---|
1660 | 1 | 2 |
1800 | 1 | 8 |
1860 | 7 | 27 |
1870 | 4 | 19 |
1880 | 12 | 55 |
1890 | 19 | 78 |
1900 | 43 | 183 |
1910 | 21 | 83 |
1920 | 7 | 31 |
1940 | 2 | 18 |
1960 | 2 | 21 |
1970 | 6 | 27 |
1980 | 8 | 45 |
1990 | 9 | 48 |
2000 | 4 | 55 |
2010 | 11 | 170 |
2020 | 1 | 5 |
You will notice that there are no rows for the decades 1810 through 1859. This is because there are no incidents where the cause was clearly or likely mental illness. (If this seems odd, schizophrenia rates in Britain roughly octupled from the17th to 19th centuries, and similar dramatic increases happen in America over this same period. One possible cause is cats going from mousekillers in barns into homes as pets with toxoplasmosis in cat feces, as one possible environmental trigger.) So how do I get rows with 0 incidents into that query?
I tried:
SELECT int(year/10)*10 AS decade, count(*) AS incidents, sum(Table2.dead) AS 'dead'but the complaint is that "Your query does not include the specified expression int(year/10)*10 as an aggregate function."
FROM Table2
WHERE ((cause_ID=15) or (cause_ID=16))
UNION
SELECT int(year/10)*10 AS decade, count(*) AS incidents, sum(Table2.dead) AS 'dead'
FROM Table2
WHERE ((cause_ID<>15) and (cause_ID<>16))
GROUP BY (int (year/10)*10) ;
SELECT int(year/10)*10 AS decade, count(*) AS incidents, sum(Table2.dead) AS 'dead'
FROM Table2
WHERE ((cause_ID=15) or (cause_ID=16))
GROUP BY (int (year/10)*10)
UNION SELECT int(year/10)*10 AS decade, 0, 0
FROM Table2
WHERE ((cause_ID<>15) and (cause_ID<>16))
GROUP BY (int (year/10)*10);
However, if the first SELECT has matches, the UNION gives me two rows before DECADE, and some decades are still missing:
decade | incidents | 'dead' |
---|---|---|
1650 | 0 | 0 |
1660 | 1 | 2 |
1680 | 0 | 0 |
1690 | 0 | 0 |
1720 | 0 | 0 |
1750 | 0 | 0 |
1780 | 0 | 0 |
1800 | 0 | 0 |
1800 | 1 | 8 |
1820 | 0 | 0 |
1830 | 0 | 0 |
1840 | 0 | 0 |
1850 | 0 | 0 |
1860 | 0 | 0 |
1860 | 7 | 27 |
1870 | 0 | 0 |
1870 | 4 | 19 |
1880 | 0 | 0 |
1880 | 12 | 55 |
1890 | 0 | 0 |
1890 | 19 | 78 |
1900 | 0 | 0 |
1900 | 43 | 183 |
1910 | 0 | 0 |
1910 | 21 | 83 |
1920 | 0 | 0 |
1920 | 7 | 31 |
1930 | 0 | 0 |
1940 | 2 | 18 |
1950 | 0 | 0 |
1960 | 0 | 0 |
1960 | 2 | 21 |
1970 | 0 | 0 |
1970 | 6 | 27 |
1980 | 0 | 0 |
1980 | 8 | 45 |
1990 | 0 | 0 |
1990 | 9 | 48 |
2000 | 0 | 0 |
2000 | 4 | 55 |
2010 | 0 | 0 |
2010 | 11 | 170 |
2020 | 0 | 0 |
2020 | 1 | 5 |
I think instead of a UNION I need to only do the second SELECT if there is no record from the first SELECT.
You need to include the group by in the first half before the union. A union combines two complete queries, so each half has to stand alone.
ReplyDeleteYou need two GROUP BY clauses, one for each SELECT statement. Add a copy of the GROUP BY clause before the UNION.
ReplyDeleteI'm not an expert in SQL, but have you tried putting the same GROUP BY after both WHERE clauses? Your problem might be that you're trying to take the union of two result sets that don't have the same structure, i.e. the first isn't grouped, but the second one is.
ReplyDeleteClayton:
ReplyDeleteTry using conditional sums (sum( case ....)). This will report all years, but only aggregate data for the categories in question.
For example:
SELECT
int(year/10)*10 AS Decade,
SUM(
CASE
WHEN cause_id=15 THEN 1
WHEN cause_id=17 THEN 1
ELSE 0
END
) AS Incidents,
SUM(
WHEN cause_id=15 THEN dead
WHEN cause_id=16 THEN dead
ELSE 0
END
) AS Dead
FROM
Table2
GROUP BY
(int(year/10)*10)
ORDER BY
(int(year/10)*10) ASC;
Good luck!
Wayne