# Thread: Simple moving avg (SMA)

1. Registered User
Join Date
Feb 2010
Location
Montreal, CA
Posts
32

## Unanswered: Simple moving avg (SMA)

Hello,

I’d like to calculate a simple moving average (SMA).
Processing time appears to be long. I’m hoping someone can help me with an efficient syntax.

E.g. tblTemp
ID, Date, Temperature
1, 2010-01-01, 5
2, 2010-01-02, 6
3, 2010-01-03, 5
4, 2010-01-04, 7
5, 2010-01-05, 2
6, 2010-01-06, 1
7, 2010-01-07, 6
8, 2010-01-08, 7
9, 2010-01-09, 9
10, 2010-01-10, 9

A 5-day SMA would require using the last 5 days of temperature data to calculate the average.
I.e. for period 5 use ID=1 to 5, period 6 use ID=2 to 6, etc...

***

This is an oversimplified example; my table will actually contain over 30,000 records and I would like to calculate SMAs for, say, 60, 120, 250 day periods.

Right now, running a SMA=150 for 500 records takes 10 minutes

Can someone help??

Join Date
Nov 2004
Location
out on a limb
Posts
13,692
instead of calculating a SMA have you considered using exponential smoothing instead. the downside is that you'd probably store the smoothed value.

might need a bit of juggling to get the correct smoothing factor
exponential smoothing moving average - Google Search

3. vaguely human
Join Date
Jun 2007
Location
London
Posts
2,527
Originally Posted by Diesel Dud
Right now, running a SMA=150 for 500 records takes 10 minutes
It shouldn't take 10 minutes to run that type of query so just a few questions ...
• Is the id field a foreign key to another table or is it just a unique id for the table?
• What are the indexes on the table?
• Does date appear as the first field in any of these indexes or are you just using id or id and then date?
• Can we see the SQL you are using and the table creation (with indexes) please?

Mike

4. Registered User
Join Date
Feb 2010
Location
Montreal, CA
Posts
32
Originally Posted by healdem
instead of calculating a SMA have you considered using exponential smoothing instead. the downside is that you'd probably store the smoothed value.

might need a bit of juggling to get the correct smoothing factor
exponential smoothing moving average - Google Search
i am open to using an EMA, although i would suspect it would require more processing

5. Registered User
Join Date
Feb 2010
Location
Montreal, CA
Posts
32
Originally Posted by mike_bike_kite
It shouldn't take 10 minutes to run that type of query so just a few questions ...
• Is the id field a foreign key to another table or is it just a unique id for the table?
• What are the indexes on the table?
• Does date appear as the first field in any of these indexes or are you just using id or id and then date?
• Can we see the SQL you are using and the table creation (with indexes) please?

Mike
i feel kinda stupid i was (almost) 100&#37; positive the PK was ID.. just checked this morning and it wasnt indexed properly!!

31,000 records now takes about 70 seconds.. much better!
could i be more efficient in my query?

this e.g. uses a 5-day SMA, i will likely use 150 or 300
SELECT x.Id,
AVG(y.Temperature) as SMA
FROM tblTemp as x,
tblTemp as y
WHERE x.Id>=5 AND x.Id BETWEEN y.Id AND y.Id+4
GROUP BY x.Id
ORDER BY x.Id

thanks!

Join Date
Nov 2004
Location
out on a limb
Posts
13,692
Originally Posted by Diesel Dud
i am open to using an EMA, although i would suspect it would require more processing
nope it requires less processing, althou8gh it does require storing intermediate / last resuklt

the forescat/.average for the next item is CurrentValue + (CurrentValue-LastForecast)*SmoothingFactor

7. vaguely human
Join Date
Jun 2007
Location
London
Posts
2,527
Originally Posted by Diesel Dud
31,000 records now takes about 70 seconds.. much better!
could i be more efficient in my query?
I don't know anything about EMA so I'll just comment on your existing SQL. It can almost certainly be improved - you should be able to make it under a second if that's important to you. A few points though:
• Make your table names more descriptive (tblTemp means nothing to someone trying to understand your code).
• The id field appears unnecessary, and indeed hides the logic of what you are doing, wouldn't it be better to use date_add()? This is must also be calculated somewhere as well so it's just more code to maintain and possibly go wrong.
• Your join looks odd to me but then I'm not totally sure what you're after - perhaps if you could give the sample output using the test data above.

Mike

8. Registered User
Join Date
Feb 2010
Location
Montreal, CA
Posts
32
Originally Posted by mike_bike_kite
I don't know anything about EMA so I'll just comment on your existing SQL. It can almost certainly be improved - you should be able to make it under a second if that's important to you. A few points though:
• Make your table names more descriptive (tblTemp means nothing to someone trying to understand your code).
• The id field appears unnecessary, and indeed hides the logic of what you are doing, wouldn't it be better to use date_add()? This is must also be calculated somewhere as well so it's just more code to maintain and possibly go wrong.
• Your join looks odd to me but then I'm not totally sure what you're after - perhaps if you could give the sample output using the test data above.

Mike
In an effort to simply my problem, I had created a simple example.
I will use the actual example.

`ID` is being used as the PK and is being incremented automatically on the insert.
An alternative would be to use `Dte` and `Tme`, although this seems a little more complex to work with.

--
-- Table structure for table `IntradayPairs`
--

CREATE TABLE IF NOT EXISTS `IntradayPairs` (
`ID` int(11) NOT NULL auto_increment COMMENT 'ID',
`Sym_A` varchar(25) NOT NULL COMMENT 'Pair Symbol A',
`Sym_B` varchar(25) NOT NULL COMMENT 'Pair Symbol B',
`Dte` date NOT NULL COMMENT 'Date',
`Tme` time NOT NULL COMMENT 'Time',
`Sym_A_Cls` decimal(8,4) NOT NULL COMMENT 'Symbol Pair A Closing Price',
`Sym_B_Cls` decimal(8,4) NOT NULL COMMENT 'Symbol Pair B Closing Price',
`Prc_Rto` decimal(6,4) NOT NULL COMMENT 'Price Ratio',
`SMA` decimal(6,4) NOT NULL COMMENT 'Simple Moving Average',
PRIMARY KEY (`ID`),
KEY `ID_Day` (`ID`,`Sym_A`,`Sym_B`,`Dte`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=31501 ;

--
-- Dumping data for table `IntradayPairs`
--

INSERT INTO `IntradayPairs` (`ID`, `Sym_A`, `Sym_B`, `Dte`, `Tme`, `Sym_A_Cls`, `Sym_B_Cls`, `Prc_Rto`, `SMA`) VALUES
(1, 'HD', 'LOW', '2009-08-06', '01:00:00', '26.5000', '22.7500', '1.1648', '0.0000'),
(2, 'HD', 'LOW', '2009-08-06', '01:01:00', '26.5100', '22.7400', '1.1658', '0.0000'),
(3, 'HD', 'LOW', '2009-08-06', '01:02:00', '26.5000', '22.7400', '1.1653', '0.0000'),
(4, 'HD', 'LOW', '2009-08-06', '01:03:00', '26.4900', '22.7400', '1.1649', '0.0000'),
(5, 'HD', 'LOW', '2009-08-06', '01:04:00', '26.5100', '22.7600', '1.1648', '0.0000'),
(6, 'HD', 'LOW', '2009-08-06', '01:05:00', '26.4900', '22.7400', '1.1649', '0.0000'),
(7, 'HD', 'LOW', '2009-08-06', '01:06:00', '26.5000', '22.7600', '1.1643', '0.0000'),
(8, 'HD', 'LOW', '2009-08-06', '01:07:00', '26.5100', '22.7600', '1.1648', '0.0000'),
(9, 'HD', 'LOW', '2009-08-06', '01:08:00', '26.5300', '22.7700', '1.1651', '0.0000'),
(10, 'HD', 'LOW', '2009-08-06', '01:09:00', '26.5300', '22.7700', '1.1651', '0.0000'),
(11, 'HD', 'LOW', '2009-08-06', '01:10:00', '26.5000', '22.7700', '1.1638', '0.0000'),
(12, 'HD', 'LOW', '2009-08-06', '01:11:00', '26.5000', '22.7700', '1.1638', '0.0000'),
(13, 'HD', 'LOW', '2009-08-06', '01:12:00', '26.5200', '22.7800', '1.1642', '0.0000'),
(14, 'HD', 'LOW', '2009-08-06', '01:13:00', '26.5100', '22.7800', '1.1637', '0.0000'),
(15, 'HD', 'LOW', '2009-08-06', '01:14:00', '26.5300', '22.7800', '1.1646', '0.0000'),
(16, 'HD', 'LOW', '2009-08-06', '01:15:00', '26.5400', '22.8000', '1.1640', '0.0000'),
(17, 'HD', 'LOW', '2009-08-06', '01:16:00', '26.5500', '22.8000', '1.1645', '0.0000'),
(18, 'HD', 'LOW', '2009-08-06', '01:17:00', '26.5400', '22.8000', '1.1640', '0.0000'),
(19, 'HD', 'LOW', '2009-08-06', '01:18:00', '26.5500', '22.7900', '1.1650', '0.0000'),
(20, 'HD', 'LOW', '2009-08-06', '01:19:00', '26.5500', '22.7800', '1.1655', '0.0000'),
(21, 'HD', 'LOW', '2009-08-06', '01:20:00', '26.5500', '22.7900', '1.1650', '0.0000'),
(22, 'HD', 'LOW', '2009-08-06', '01:21:00', '26.5600', '22.7900', '1.1654', '0.0000'),
(23, 'HD', 'LOW', '2009-08-06', '01:22:00', '26.5700', '22.8100', '1.1648', '0.0000'),
(24, 'HD', 'LOW', '2009-08-06', '01:23:00', '26.5700', '22.8200', '1.1643', '0.0000'),
(25, 'HD', 'LOW', '2009-08-06', '01:24:00', '26.5700', '22.8300', '1.1638', '0.0000'),
(26, 'HD', 'LOW', '2009-08-06', '01:25:00', '26.5700', '22.8100', '1.1648', '0.0000'),
(27, 'HD', 'LOW', '2009-08-06', '01:26:00', '26.5500', '22.8000', '1.1645', '0.0000'),
(28, 'HD', 'LOW', '2009-08-06', '01:27:00', '26.5500', '22.7800', '1.1655', '0.0000'),
(29, 'HD', 'LOW', '2009-08-06', '01:28:00', '26.5500', '22.7800', '1.1655', '0.0000'),
(30, 'HD', 'LOW', '2009-08-06', '01:29:00', '26.5600', '22.7800', '1.1659', '0.0000'),
(31, 'HD', 'LOW', '2009-08-06', '01:30:00', '26.5800', '22.8100', '1.1653', '0.0000'),
(32, 'HD', 'LOW', '2009-08-06', '01:31:00', '26.6000', '22.8100', '1.1662', '0.0000'),
(33, 'HD', 'LOW', '2009-08-06', '01:32:00', '26.6000', '22.8100', '1.1662', '0.0000'),
(34, 'HD', 'LOW', '2009-08-06', '01:33:00', '26.6000', '22.8000', '1.1667', '0.0000'),
(35, 'HD', 'LOW', '2009-08-06', '01:34:00', '26.5800', '22.8000', '1.1658', '0.0000'),
(36, 'HD', 'LOW', '2009-08-06', '01:35:00', '26.5800', '22.7900', '1.1663', '0.0000'),
(37, 'HD', 'LOW', '2009-08-06', '01:36:00', '26.5600', '22.7800', '1.1659', '0.0000'),
(38, 'HD', 'LOW', '2009-08-06', '01:37:00', '26.5600', '22.7900', '1.1654', '0.0000'),
(39, 'HD', 'LOW', '2009-08-06', '01:38:00', '26.5600', '22.7700', '1.1664', '0.0000'),
(40, 'HD', 'LOW', '2009-08-06', '01:39:00', '26.5400', '22.7400', '1.1671', '0.0000'),
(41, 'HD', 'LOW', '2009-08-06', '01:40:00', '26.5300', '22.7500', '1.1662', '0.0000'),
(42, 'HD', 'LOW', '2009-08-06', '01:41:00', '26.5100', '22.7200', '1.1668', '0.0000'),
(43, 'HD', 'LOW', '2009-08-06', '01:42:00', '26.5000', '22.6900', '1.1679', '0.0000'),
(44, 'HD', 'LOW', '2009-08-06', '01:43:00', '26.4800', '22.7200', '1.1655', '0.0000'),
(45, 'HD', 'LOW', '2009-08-06', '01:44:00', '26.4600', '22.7100', '1.1651', '0.0000'),
(46, 'HD', 'LOW', '2009-08-06', '01:45:00', '26.4600', '22.7200', '1.1646', '0.0000'),
(47, 'HD', 'LOW', '2009-08-06', '01:46:00', '26.4600', '22.7200', '1.1646', '0.0000'),
(48, 'HD', 'LOW', '2009-08-06', '01:47:00', '26.4800', '22.7100', '1.1660', '0.0000'),
(49, 'HD', 'LOW', '2009-08-06', '01:48:00', '26.4400', '22.7000', '1.1648', '0.0000'),
(50, 'HD', 'LOW', '2009-08-06', '01:49:00', '26.4400', '22.7000', '1.1648', '0.0000'),
(51, 'HD', 'LOW', '2009-08-06', '01:50:00', '26.4400', '22.6700', '1.1663', '0.0000'),
(52, 'HD', 'LOW', '2009-08-06', '01:51:00', '26.4400', '22.6800', '1.1658', '0.0000'),
(53, 'HD', 'LOW', '2009-08-06', '01:52:00', '26.4500', '22.6800', '1.1662', '0.0000'),
(54, 'HD', 'LOW', '2009-08-06', '01:53:00', '26.4400', '22.6800', '1.1658', '0.0000'),
(55, 'HD', 'LOW', '2009-08-06', '01:54:00', '26.4000', '22.6600', '1.1650', '0.0000'),
(56, 'HD', 'LOW', '2009-08-06', '01:55:00', '26.4000', '22.6500', '1.1656', '0.0000'),
(57, 'HD', 'LOW', '2009-08-06', '01:56:00', '26.4100', '22.6500', '1.1660', '0.0000'),
(58, 'HD', 'LOW', '2009-08-06', '01:57:00', '26.4100', '22.6300', '1.1670', '0.0000'),
(59, 'HD', 'LOW', '2009-08-06', '01:58:00', '26.4100', '22.6300', '1.1670', '0.0000'),
(60, 'HD', 'LOW', '2009-08-06', '01:59:00', '26.4000', '22.6200', '1.1671', '0.0000'),
(61, 'HD', 'LOW', '2009-08-06', '02:00:00', '26.3800', '22.6100', '1.1667', '0.0000'),
(62, 'HD', 'LOW', '2009-08-06', '02:01:00', '26.3400', '22.5600', '1.1676', '0.0000'),
(63, 'HD', 'LOW', '2009-08-06', '02:02:00', '26.3300', '22.5500', '1.1676', '0.0000'),
(64, 'HD', 'LOW', '2009-08-06', '02:03:00', '26.3300', '22.5600', '1.1671', '0.0000'),
(65, 'HD', 'LOW', '2009-08-06', '02:04:00', '26.3400', '22.5600', '1.1676', '0.0000'),
(66, 'HD', 'LOW', '2009-08-06', '02:05:00', '26.3300', '22.5200', '1.1692', '0.0000'),
(67, 'HD', 'LOW', '2009-08-06', '02:06:00', '26.3100', '22.5200', '1.1683', '0.0000'),
(68, 'HD', 'LOW', '2009-08-06', '02:07:00', '26.3100', '22.5300', '1.1678', '0.0000'),
(69, 'HD', 'LOW', '2009-08-06', '02:08:00', '26.3100', '22.5600', '1.1662', '0.0000'),
(70, 'HD', 'LOW', '2009-08-06', '02:09:00', '26.3100', '22.5600', '1.1662', '0.0000'),
(71, 'HD', 'LOW', '2009-08-06', '02:10:00', '26.3100', '22.5500', '1.1667', '0.0000'),
(72, 'HD', 'LOW', '2009-08-06', '02:11:00', '26.3000', '22.5400', '1.1668', '0.0000'),
(73, 'HD', 'LOW', '2009-08-06', '02:12:00', '26.3000', '22.5400', '1.1668', '0.0000'),
(74, 'HD', 'LOW', '2009-08-06', '02:13:00', '26.3000', '22.5400', '1.1668', '0.0000'),
(75, 'HD', 'LOW', '2009-08-06', '02:14:00', '26.3100', '22.5600', '1.1662', '0.0000'),
(76, 'HD', 'LOW', '2009-08-06', '02:15:00', '26.3100', '22.5400', '1.1673', '0.0000'),
(77, 'HD', 'LOW', '2009-08-06', '02:16:00', '26.3400', '22.5400', '1.1686', '0.0000'),
(78, 'HD', 'LOW', '2009-08-06', '02:17:00', '26.3500', '22.5400', '1.1690', '0.0000'),
(79, 'HD', 'LOW', '2009-08-06', '02:18:00', '26.3500', '22.5300', '1.1696', '0.0000'),
(80, 'HD', 'LOW', '2009-08-06', '02:19:00', '26.3400', '22.5300', '1.1691', '0.0000');

SET `Prc_Rto` = `Sym_A_Cls` / `Sym_B_Cls;

9. Registered User
Join Date
Feb 2010
Location
Montreal, CA
Posts
32
I would like to calculate the SMA for column `Prc_Rto`

The following query takes 0.0003 seconds to execute on this data set on 80 records:
SELECT x.Id,
AVG(y.`Prc_Rto`) as SMA
WHERE x.Id>=5 AND x.Id BETWEEN y.Id AND y.Id+4
GROUP BY x.Id
ORDER BY x.Id

I will have 30,000+ records in the actual table and would like the optimal way to code this calculation.

Thanks.

10. vaguely human
Join Date
Jun 2007
Location
London
Posts
2,527
Originally Posted by Diesel Dud
I will use the actual example.
I see you want to store the SMA value in the table rather than calculate it on the fly - which is fine by the way assuming you cope with price updates etc altering the SMA. Because you're only updating the data infrequently it shouldn't really matter whether it takes 70 secs or not. It seems silly to spend your time trying to bring this time down unless there's a good reason. A side benefit will be that I won't need to work my way through your data and SQL

11. Registered User
Join Date
Feb 2010
Location
Montreal, CA
Posts
32
Originally Posted by mike_bike_kite
I see you want to store the SMA value in the table rather than calculate it on the fly - which is fine by the way assuming you cope with price updates etc altering the SMA. Because you're only updating the data infrequently it shouldn't really matter whether it takes 70 secs or not. It seems silly to spend your time trying to bring this time down unless there's a good reason. A side benefit will be that I won't need to work my way through your data and SQL
calculation of the SMA is one of several steps needed to determine if the pair is profitable; e.g. after SMA, a simple moving avg for the standard deviation will be calculated

the process will be repeated for periods of 60, 120, 240, and 390 periods

these are just some of the steps which will be executed on this workfile of 30,000 records; the workfile will then be flushed and the next pair will undergo the same process; i have hundreds of pairs to process.

as you can see, if i can shave off some time calculating these SMAs, total process time savings will be huge

12. vaguely human
Join Date
Jun 2007
Location
London
Posts
2,527
Originally Posted by Diesel Dud
as you can see, if i can shave off some time calculating these SMAs, total process time savings will be huge
The savings may well be huge but I joined the thread under the impression it was a scientific issue - you said originally that it was temperature you were modelling rather than prices. I make my living working with financial databases and just feel uncomfortable doing it for free. My apologies but I'm sure someone else will chip in soon to help you out.

Mike

13. Registered User
Join Date
Feb 2010
Location
Montreal, CA
Posts
32
.....bump

14. Registered User
Join Date
Feb 2010
Location
Montreal, CA
Posts
32
....... bump

Join Date
Nov 2004
Location
out on a limb
Posts
13,692