I have a 1.7 million row CSV I would like to filter and serve from a static function (something like Google Cloud functions). Best approach? #javascript #nodejs
@tqft R is handy for doing the manipulation, but I've done all that. Just want to serve it up over the network—ideally without running my own server.
I think a Node function backed by an SQLite database will be good enough for now.
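(A minimal sketch of that plan, assuming a Firebase-style HTTPS function and the better-sqlite3 module; both are my assumptions rather than anything the thread specifies, and the database file, table, and column names are placeholders:)

```js
// Sketch only: an HTTPS cloud function filtering a SQLite file bundled with the deploy.
// Assumes firebase-functions and better-sqlite3; "data.db", "records" and "category"
// are placeholder names.
const functions = require('firebase-functions');
const Database = require('better-sqlite3');

// Open read-only once per instance so warm invocations can reuse the handle.
const db = new Database(__dirname + '/data.db', { readonly: true });
const byCategory = db.prepare('SELECT * FROM records WHERE category = ?');

exports.filterCsv = functions.https.onRequest((req, res) => {
  const value = req.query.category;
  if (!value) {
    res.status(400).json({ error: 'category query parameter required' });
    return;
  }
  res.json(byCategory.all(value));
});
```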
@drzax I think something like this is what I was thinking of; it's on my very long list of things to look at: https://stackoverflow.com/questions/23750096/publish-rstudio-shiny-app-in-intranet (R Shiny)
@tqft AFAIK, Shiny is for interactive data vis. I just want to serve the data itself.
@drzax serve to whom, how?
@DenubisX To any punter on the web, filtered by a value in a specific column.
I think my issue is actually with the performance of Firebase cloud functions, not my implementation, per se.
@drzax so my first reaction is SQL.js. 1.7m rows with filtering strikes me as a client-side problem.
@DenubisX Data is a little heavy for client side, I think. I'd rather not throw ~30MB down the pipe for every request. (Though it might be quite a bit smaller after gzip).
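(For comparison, a rough sketch of the client-side sql.js approach just suggested, assuming the whole database file really were shipped to the browser; the file path, table, and column names are placeholders:)

```js
// Sketch only: load an entire SQLite file in the browser with sql.js and filter it there.
// Assumes initSqlJs is available from the sql.js script tag (or via require('sql.js')).
async function loadAndFilter(value) {
  const SQL = await initSqlJs({
    locateFile: file => `https://sql.js.org/dist/${file}`,
  });

  // Fetch the whole (~30MB) database up front.
  const buffer = await fetch('/data.db').then(r => r.arrayBuffer());
  const db = new SQL.Database(new Uint8Array(buffer));

  // Placeholder table/column names; collect matching rows as plain objects.
  const stmt = db.prepare('SELECT * FROM records WHERE category = ?');
  stmt.bind([value]);
  const rows = [];
  while (stmt.step()) rows.push(stmt.getAsObject());
  stmt.free();
  return rows;
}
```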
@drzax mrf. Ok, so... what's the data and how many hits and what sort of caching and who/what will use it?
@drzax https://simonwillison.net/2017/Nov/13/datasette/ I don't think that serves the whole db?
@drzax and queries are cached behind Cloudflare, so... yeah, it can't serve the whole db. But for 1.7m rows of arbitrary data, that feels like it fits well and scales super well.
@drzax though as it limits results to 1k rows, mmm, not so much.
Perhaps drop it into a Google Fusion Table and make it public?
@DenubisX Oh, I totally forgot about Datasette, and I'd even been looking for an excuse to use it. That might be perfect.
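(If Datasette does end up being the answer here, consuming it is just a JSON fetch. A sketch, assuming Node 18+ for the built-in fetch, with placeholder host/database/table/column names; the "__exact" filter and "_shape=objects" parameter are Datasette query-string conventions:)

```js
// Sketch only: query a published Datasette instance's table JSON API.
// The host, database ("data"), table ("records") and column ("category") are placeholders.
async function fetchFiltered(value) {
  const url =
    'https://example-datasette.herokuapp.com/data/records.json' +
    `?category__exact=${encodeURIComponent(value)}&_shape=objects`;
  const response = await fetch(url);
  if (!response.ok) throw new Error(`Datasette returned ${response.status}`);
  const body = await response.json();
  return body.rows; // with _shape=objects each row comes back as a plain object
}
```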
@drzax bit late to the party, but I reckon that sounds like a good plan.
@ashkyd Turns out Firebase functions are pretty slow at this. I was seeing 20 second response times. 😳
Throwing it in an SQLite db and doing basically the same thing (using the sqlite node module) makes it better, but it's still a ~5000ms response time.
I'm pretty confident it's a Firebase functions processing capacity issue, rather than my implementation.
@drzax That's a concern; I wonder if streaming the data would help at all. But also, super congrats btw :D
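(On the streaming idea: a rough sketch of what that might look like with better-sqlite3's row iterator, writing newline-delimited JSON instead of buffering the whole result. Same placeholder names as above, and no claim that it fixes the function's cold-start or CPU limits:)

```js
// Sketch only: stream matching rows as newline-delimited JSON instead of buffering them all.
const Database = require('better-sqlite3');
const db = new Database(__dirname + '/data.db', { readonly: true });
const byCategory = db.prepare('SELECT * FROM records WHERE category = ?');

function streamRows(req, res) {
  res.setHeader('Content-Type', 'application/x-ndjson');
  // iterate() yields rows one at a time rather than materialising the full array.
  for (const row of byCategory.iterate(req.query.category)) {
    res.write(JSON.stringify(row) + '\n');
  }
  res.end();
}
```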
@ashkyd Thanks :-)
@ashkyd Also, it's super weird that this thing autocorrects :-) to 😞
https://bne.social/media/JLZq-Ppo5QlGISQy63U
@drzax Hah, that is fun. I think you're triggering the slack-style emoji (colon+word+colon) search, for which the first "d" entry is disappointed. 🐵
@drzax (I should update it actually, we're a little behind again)
@drzax Sounds like a problem for R. I haven't done cloud stuff with it, though.