SnoopJ
@SnoopJ@hachyderm.io

I feel like this is a good point in this project to ask:

What other #Boston-area theatres should I know about that do appreciable numbers of special film screenings?

So far, I'm keeping track of:

Coolidge Corner Theatre
Somerville Theatre
The Brattle
Alamo Drafthouse
Landmark Kendall Square Cinema

But what else should I know about? I am probably going to add The Capitol and Apple Cinemas to this list.

Anything reachable by the T/commuter rail or bike from downtown is of interest.

#SomervilleMA #CambridgeMA

SnoopJ
@SnoopJ@hachyderm.io

Still mulling the problem of how to make it easy to dismiss mass-market stuff when the reader of the calendar considers that noise.

Maybe it makes sense to rank titles by their frequency in the filter checkbox? But then it's harder to scan for the one you want. Would a search bar alleviate that problem?

My dream here is a "hide new megacorp releases" button but it's a tough nut to crack.


SnoopJ
@SnoopJ@hachyderm.io

Especially since I'm not opposed to showing new releases on this calendar, it's just that they aren't really the goal.

Like, I wanna see WEAPONS, so this will probably be useful to select a showing of WEAPONS to attend. But I'm not primarily going to be looking at this calendar to find showings like that.

SnoopJ
@SnoopJ@hachyderm.io

But it does seem reasonable to assume that anything sufficiently in thrall to Money Bastards will appear multiple times in the data.

The main problem is that I would need to retain a window into the past because I don't want F1®® to show up as not-a-new-mass-market-release on its last day of screening because of window effects.

Feel like I'm overthinking this but I haven't gotten to the "oh, right, duh" part yet.
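
A minimal sketch of the frequency idea above, assuming showings are kept as (title, date) pairs and that past days are retained alongside upcoming ones. The function name, the 14-day lookback and the threshold of 10 are all hypothetical choices for illustration, not anything taken from the actual project.

```python
from collections import Counter
from datetime import date, timedelta


def mass_market_titles(showings, today, lookback_days=14, threshold=10):
    """Return titles with at least `threshold` showings since the window start.

    `showings` is an iterable of (title, show_date) pairs that includes
    retained past data as well as upcoming showings (an assumption).
    """
    window_start = today - timedelta(days=lookback_days)
    counts = Counter(
        title for title, show_date in showings if show_date >= window_start
    )
    return {title for title, n in counts.items() if n >= threshold}
```

With no lookback, a wide release on its last day of screening only has a handful of showings left in view and slips under the threshold, which is the window effect described above. Keeping the trailing days in the count avoids that.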

SnoopJ
@SnoopJ@hachyderm.io

A more immediate problem that wants solving is looking out further than a week. Many of these data streams have data looking that far out, although many of the more interesting cinemas fall off faster

SnoopJ
@SnoopJ@hachyderm.io

wild to me how much javascript is on these sites to do so very much nothing

single script out of a dozen on one page at 4000 lines when prettified, like… buddy, my thing is currently 624 lines total and I cannot imagine it would take more than 1000 of JS to make it REAL schmick

I know that comparison on LoC is vague at best and I know that webtech exists to serve ads and everything else is a side effect

but goddamn

SnoopJ
@SnoopJ@hachyderm.io

wow, Regent Theatre's use of a commercial offering called EventON to store their event data really made writing a provider for their events quite a nuisance

features include:

* shocking number of form fields (may be PHP/WP data?)
* date range fields that are ignored when satisfying the request
* multiple nonces re-used for every request
* serving HTML over JSON
* putting more information in that HTML than the sibling JSON metadata
* inability to link to a particular month in the on-page calendar schedule
* no event pages to link to :(

most of those are the software's fault, although the last one feels like the theatre gets some credit. oh well, I'll link to the main schedule page and the user can figure it out, the calendar does tell them what day the event is on

SnoopJ
@SnoopJ@hachyderm.io

but in the end, I won, their movies now appear on the calendar

SnoopJ
@SnoopJ@hachyderm.io

does present me with an interesting conundrum of what I will do when I aggregate the info from all of these into a single stream that other people can consume… not sure if people will want info about non-movies from a project mostly focusing on movies, but OTOH I'm already doing the necessary scraping work for them…

SnoopJ
@SnoopJ@hachyderm.io

The API backing Apple Cinemas has several eyebrow-raisers:

* TLS fingerprinting by CloudFlare is enabled
* Typos in the API (Location vs. Loction)
* Redundant/ignored query parameters (movieID vs actualMovidId, end datetime for a range query does not matter, does not even need to be after start datetime)

This one is I think going to require me to send 1 + Nmovies*Ndays requests
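
A rough sketch of where the Nmovies*Ndays part comes from, assuming one initial request returns the list of movie IDs and each (movie, day) pair then needs its own query. `plan_queries` is a hypothetical helper name and it sends no requests itself.

```python
from datetime import date, timedelta
from itertools import product


def plan_queries(movie_ids, first_day, n_days):
    """Enumerate every (movie_id, day) pair that needs its own request."""
    days = [first_day + timedelta(days=i) for i in range(n_days)]
    return list(product(movie_ids, days))
```

The total request count is then 1 (for the movie list) plus `len(plan_queries(...))`.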

SnoopJ
@SnoopJ@hachyderm.io

Writing up my provider code for Apple Cinemas and find myself writing the following in a comment explaining that the query route ignores the end date:

# We'll set this parameter "right" anyway, as a prayer for that messy API's soul.

SnoopJ
@SnoopJ@hachyderm.io

caching N*M web requests is mildly annoying but the alternative is to cache after I've started to munge things and I kinda hate doing that in an application like this.

it's easy enough to cache by filename with {actualMovieID}_{query_start_date}_{query_performed_date}.json, just annoying
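
A minimal sketch of that filename scheme, assuming ISO-formatted dates and a caller-supplied `fetch` callable that performs the real request. `cache_path`, `fetch_cached` and the `fetch` signature are hypothetical names for illustration.

```python
import json
from datetime import date
from pathlib import Path


def cache_path(cache_dir, actual_movie_id, query_start, query_performed):
    """Build {actualMovieID}_{query_start_date}_{query_performed_date}.json."""
    name = (
        f"{actual_movie_id}_{query_start.isoformat()}"
        f"_{query_performed.isoformat()}.json"
    )
    return Path(cache_dir) / name


def fetch_cached(cache_dir, actual_movie_id, query_start, query_performed, fetch):
    """Return the cached response if present, else call `fetch` and store it."""
    path = cache_path(cache_dir, actual_movie_id, query_start, query_performed)
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    data = fetch(actual_movie_id, query_start)  # hypothetical request function
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data), encoding="utf-8")
    return data
```

Including the date the query was performed in the filename means a new day naturally misses the cache, at the cost of one file per request per day.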

SnoopJ
@SnoopJ@hachyderm.io

I probably need a caching implementation that I can re-use between providers. I keep re-writing the simple parts of that right where it's needed.

SnoopJ
@SnoopJ@hachyderm.io

that N*M is 155, by the way. not really enough to justify being annoyed about it, but enough to be annoying

SnoopJ
@SnoopJ@hachyderm.io

but spewing 155 files to disk every day, on the other hand, that's another level of obnoxious. It's only 3 MB of data (on the other hand, it's 3 MB of data!) but yea, maybe I should collate these requests and serialize the function with the loop that creates them.

SnoopJ
@SnoopJ@hachyderm.io

In which I forget to de-dupe

SnoopJ
@SnoopJ@hachyderm.io

enhance

SnoopJ
@SnoopJ@hachyderm.io

In addition to their API being… idiosyncratic… the South Asian showings at Apple Cinemas are also an interesting edge case. These are separate showings and would not fit into the false dichotomy of dub/sub that one might be tempted to adopt for this domain:

Coolie (Tamil)
Coolie (Telugu)
Coolie (Hindi)

I could see the case for collapsing those to a single listing (especially if the languages could be combined) but I don't think I will bother for now, it's not as big a problem for the reader as mass-market movies are
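
For what collapsing might look like, here is a small sketch that assumes the language is always a trailing parenthetical and comes from a known set. `KNOWN_LANGUAGES`, `split_language` and `collapse_by_language` are hypothetical. Other parentheticals (a year, a format) are deliberately left alone.

```python
import re

# Assumption: only these tags count as languages; anything else stays in the title.
KNOWN_LANGUAGES = {"Tamil", "Telugu", "Hindi"}
SUFFIX = re.compile(r"^(?P<title>.+?)\s*\((?P<tag>[^()]+)\)$")


def split_language(listing):
    """Split 'Coolie (Tamil)' into ('Coolie', 'Tamil'); else (listing, None)."""
    match = SUFFIX.match(listing)
    if match and match.group("tag") in KNOWN_LANGUAGES:
        return match.group("title"), match.group("tag")
    return listing, None


def collapse_by_language(listings):
    """Group listings by base title, collecting the languages seen for each."""
    grouped = {}
    for listing in listings:
        title, language = split_language(listing)
        languages = grouped.setdefault(title, [])
        if language is not None:
            languages.append(language)
    return grouped
```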

SnoopJ
@SnoopJ@hachyderm.io

okay, I have done the needful and now have a re-usable cache mechanism, so I can stop writing that half-a-dozen lines again and again

I settled on being okay with caching the "showings by date" internal structure that is the output of each provider before they go to the global gather. It's discarding a lot of data from the cache, but that's… fine.

The cache is really just there so we don't hammer the upstreams, and if I really want HTTP-layer caching, I can go get someone else's solution for it and plug that in.
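
A sketch of what caching a "showings by date" structure could look like, assuming it is a dict from date to a list of showing records. The `Showing` dataclass and its fields are hypothetical stand-ins, since the real types are not shown in the thread. Dates and datetimes are not JSON-serializable on their own, so each one is converted explicitly in both directions.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date, datetime
from pathlib import Path


@dataclass
class Showing:  # hypothetical record type
    title: str
    start: datetime
    url: str


def save_showings(path, showings_by_date):
    """Write {date: [Showing, ...]} to disk as JSON with ISO-format dates."""
    payload = {
        day.isoformat(): [
            {**asdict(s), "start": s.start.isoformat()} for s in showings
        ]
        for day, showings in showings_by_date.items()
    }
    Path(path).write_text(json.dumps(payload), encoding="utf-8")


def load_showings(path):
    """Read the structure back, restoring date and datetime objects."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        date.fromisoformat(day): [
            Showing(s["title"], datetime.fromisoformat(s["start"]), s["url"])
            for s in showings
        ]
        for day, showings in payload.items()
    }
```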

SnoopJ
@SnoopJ@hachyderm.io

several seconds later: ah, crap, I just realized that I'm missing the serialization of one of those types

SnoopJ
@SnoopJ@hachyderm.io

well, fixing the caching took a while, but now that's sorted… I hope…

In the process I also confirmed my theory from yesterday that Regent Theatre's API nonce changes on a daily basis. Thankfully, it's in the HTML served by the schedule root page, so it's just another web request and a pattern-match.

There is I guess potential for TOCTOU with that nonce if the program execution crosses the day boundary, but that's easy: I will "just" put this program on a timer that avoids that problem :D
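
The "web request and a pattern-match" step might look roughly like the sketch below. The thread does not show the real markup, so the `"nonce":"..."` shape and the hex character class are assumptions, and `extract_nonce` is a hypothetical name. The function only handles the pattern-match half and takes the already-downloaded HTML as a string.

```python
import re

# Assumption: the nonce appears in inline script data as "nonce":"<hex>".
NONCE_PATTERN = re.compile(r'"nonce"\s*:\s*"([0-9a-f]+)"')


def extract_nonce(html):
    """Pull the nonce out of the schedule page HTML, failing loudly if absent."""
    match = NONCE_PATTERN.search(html)
    if match is None:
        raise ValueError("no nonce found in schedule page HTML")
    return match.group(1)
```

Fetching the page and extracting the nonce at the start of every run, rather than storing it, keeps the exposure to a day-boundary change down to the duration of a single run.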

SnoopJ
@SnoopJ@hachyderm.io

Coolidge Corner's film pages are taking a snooze a lot. I guess this is the point where I am willing to send them an email so some computer-toucher can have a look at what's going on there

SnoopJ
@SnoopJ@hachyderm.io

hmm, or maybe it's only some of the pages for the Kurosawa series?

what the heck, I'll see if the issue keeps happening once the series is open, and then I'll reach out if it still is

SnoopJ
@SnoopJ@hachyderm.io

yet another example of this calendar doing EXACTLY what I want it to do:

cool, there's a showing of ASHES OF TIME on Sunday. CHUNGKING EXPRESS is the only work of Wong Kar-wai that I've seen (it's also showing, on Saturday) but I would love to experience more of his work and scratch the martial-arts movie itch

SnoopJ
@SnoopJ@hachyderm.io

it's 35mm projection too :D

SnoopJ
@SnoopJ@hachyderm.io

hmm, uh oh, looks like I'm missing some showings for Somerville Theatre though

well, that's a problem for tomorrow-Snoop, I guess

SnoopJ
@SnoopJ@hachyderm.io

OR MAYBE NOW-SNOOP

I think I figured it out

SnoopJ
@SnoopJ@hachyderm.io

yep, I figured it out, I was discarding shows that had already appeared on other days rather than just consolidating repeated showings of the same movie on the same day

easy enough fix, change the 'seen' key to a tuple of (date, film_id) instead of just film_id
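
A small sketch of that fix, assuming each raw showing is a dict with `date`, `film_id` and `time` keys (hypothetical field names). Keying on the pair consolidates repeats within a day without discarding the same film on later days.

```python
def consolidate(showings):
    """Merge repeated showings of a film on the same day into one entry."""
    seen = {}
    for showing in showings:
        key = (showing["date"], showing["film_id"])  # was: just film_id
        if key not in seen:
            seen[key] = {
                "date": showing["date"],
                "film_id": showing["film_id"],
                "times": [],
            }
        seen[key]["times"].append(showing["time"])
    return list(seen.values())
```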

SnoopJ
@SnoopJ@hachyderm.io

Another bug: looks like Alamo lists private events in their JSON API, so I'll have to put in a special rejection filter for those
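
A sketch of such a rejection filter. The thread does not say how the API marks private events, so matching the word "private" in a `title` field is purely an assumption here, and both function names are hypothetical.

```python
def is_private_event(event):
    """Guess whether an event is private from its title (assumed heuristic)."""
    title = event.get("title", "")
    return "private" in title.lower()


def public_events(events):
    """Drop events that look private, keeping the original order."""
    return [event for event in events if not is_private_event(event)]
```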

SnoopJ
@SnoopJ@hachyderm.io

I wonder what happens if you buy a ticket for a listed private event (it does seem to let you do this)

SnoopJ
@SnoopJ@hachyderm.io

I mean, I'm sure they would turn you away if you tried to just show up, but like… would someone notice? would it error during transaction?

just stepped ankle-deep into someone's edge case there