📚 node [[axol]]

Eventually I'll write about it here

Motivation

I don't necessarily want to read everything found by 'scott alexander', but it's still interesting to run search to see the overlap between people?

why not hnrss?

it's very likely more convenient to use if you only want a few HN queries, and don't care about historic ones

why axol over rss bridges?

rss is awesome! downsides:

  • might be trickier to do various post-filtering
  • with axol, you can compare results across queries (user summary)
  • can be used with promnesia maybe

motivation: I don't understand how google search alerts work. e.g. try on openbci query (see my old emails from google alert)

[2019-12-30] Ask HN: Do you still use RSS? | Hacker News [[rss]] [[toblog]]

https://news.ycombinator.com/item?id=21913598

I've just started using Feedbin about a month ago, and although my HN firehose feed is at like 1100 something, it definitely limits the rest of the HN feeds. The show and ask feeds are both stuck around 400 something.

[2018-11-24] problems with diff approach

random errors, resulting in empty diff

small differences in output (e.g. google search)

not always interested in items disappearing from query

the downside – having to keep the state :(
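
A rough sketch of what keeping the state could look like: remember ids that were already reported and only surface new ones, so an empty result from a random error or an item dropping out of the query produces no noise. `SeenState`, the JSON state file and the `id` key are assumptions for illustration, not how axol actually stores things.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List


class SeenState:
    """Remembers ids of results that were already reported (hypothetical helper)."""

    def __init__(self, path: Path) -> None:
        self.path = path
        self.seen = set(json.loads(path.read_text())) if path.exists() else set()

    def new_items(self, results: Iterable[Dict]) -> List[Dict]:
        # Only unseen ids are returned; vanished items and empty runs are ignored.
        fresh = []
        for r in results:
            if r["id"] not in self.seen:
                self.seen.add(r["id"])
                fresh.append(r)
        return fresh

    def save(self) -> None:
        self.path.write_text(json.dumps(sorted(self.seen)))
```

The price is exactly the downside above: the state file has to live somewhere and be saved after every run.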

TW at [2018-10-02] Does anyone know of tools like https://t.co/EbpbNZWQFC, but where you could plug in, roughly speaking, any search query or API (e.g. reddit/github), and it would track the results?

[2019-10-02] huh

[2020-06-22] I feel the same. So many cool things I'd love to learn about, but not enough tim… | Hacker News [[pkm]] [[axol]]

I feel the same. So many cool things I'd love to learn about, but not enough time.

Similar/existing projects

[2020-03-03] Show HN: Mailbrew – Automated Email Digests from HN, RSS, Reddit, Twitter

https://mailbrew.com/

[2019-12-26] awesome-selfhosted/awesome-selfhosted: A list of Free Software network services and web applications which can be hosted locally. Selfhosting is the process of hosting and managing applications instead of renting from Software-as-a-Service providers

https://github.com/awesome-selfhosted/awesome-selfhosted

Search Engines

trackreddit only two subscriptions

wanted lifelogging
trackreddit

tool to search on reddit or even custom services? special ordering ('least likely' for showing least occurring subreddits). could also do it on rust? [[pkm]]

searched as 'keyword monitoring tool'
tried searching on reddit, but nothing really useful..
https://github.com/trulia/thoth – unclear what it's doing

keyword tracking (SERP) – not sure if an overkill..

[2018-11-06] just implement a provider for kibitzr?

rust?

* Make it more user friendly

add axol doctor config [[project]]

also axol doctor to check individual providers + reuse in tests

rely on user config dirs

provide an ASCII diagram for crawler + report + feed reader?

* Blacklisting

maybe button to ban user? it would write to config or something? maybe I can even use some public API constructor?

I suppose pouchdb would be perfect for blacklisting [[couchdb]]

for blacklisting, instead could just apply custom per-user classes? or even edit them. that would allow to highlight properly

yeah, blacklisting could both update backend and hide locally

[2019-04-15] axol results for redditpkm, rendered at Fri 12 Apr 2019 05:07

redditpkm.html
shit. need to ignore the weapons subreddits
I think generally, my tools need to have a database…

shit, top lifelogging tweets are in Japanese… [[twitter]]

would be interesting to ignore links I already visited from results. It can even be done automatically…. [[promnesia]] [[axol]]

huh, quite a few bots on reddit? [[reddit]]

azncbot
bprogramming even maybe?
autotldr
tabledresser

huh, lots of stuff from twitter is just garbage. need a good way of suppressing it… [[axol]] [[twitter]]

[2020-01-01] twittermypy (211) - Miniflux

https://axol.karlicoss.xyz/feed/53/entries

/mypy1031

[2020-01-01] twittermypy (211) - Miniflux

https://axol.karlicoss.xyz/feed/53/entries

/aymk_mypy/status/1211970059205107712 All
twitter_mypy 7 hours ago Original @Witch_Astaroth みどりさん!この垢にしてから相互になった方の中では割と話せたと思ってます笑 来年もよろしくお願いします!

[2020-01-01] twittermypy (111) - Miniflux

https://axol.karlicoss.xyz/feed/53/entries

/mypy2424/status/1211845733210443778 All
twitter_mypy 7 hours ago Original 事実でも噂でも、クズとかいうやつお前はその人より努力してからいえよな〜って思うよ!!!!! 好きな

[2020-01-01] twittermypy (111) - Miniflux

https://axol.karlicoss.xyz/feed/53/entries

/soe1113/status/741281801323175936 All
   twitter_mypy 7 hours ago O

[2020-01-03] twitterlifelogging (20) - Miniflux

https://axol.karlicoss.xyz/feed/52/entries

/jager_atami/status/24390787028 All
twitter_lifelogging 2 days ago Original #udetate #lifelogging 陶房で壺割り 12 個 201

[2020-01-03] twitterquantifiedself (36) - Miniflux

https://axol.karlicoss.xyz/feed/55/entries

/hiperesoterismo/status/1212803558203985920 All
    twitter_quantified_self 4 hours ago Original mis únicos 4 moodspic.twitter.com/5RgPiKKhMx

What would be a good UI for axol?

I really need some sort of proper frontend browser for it…

would be nice to have some html dashboard, so it's easy to blacklist terms?

need a UI to easily add items to axol. e.g. Alexei Kitaev

maybe some simple cmdline available from anywhere. or org mode as source?

use metabase or something? could use a column to mark as seen? would be much easier than rss

dunno about rss interface… really need a more efficient way of processing content, reordering, etc

Queries

search for 'data export' or something?

[2019-12-07] not much on reddit for 'data liberation'

[2020-03-10] 'data export' looks promising on github

[2020-01-12] github.com/karlicoss - Twitter Search / Twitter [[self]]

https://twitter.com/search?q=github.com%2Fkarlicoss&src=typed_query&f=live

[2020-03-10] right, it looks quite reasonable to have

  • [2020-11-30] very few results though

[2020-03-30] All | Search powered by Algolia

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&query=github.com%2Fkarlicoss&sort=byPopularity&type=story

[2020-01-30] my. package | beepb00p [[postprivacy]] [[qs]] [[toread]]

https://beepb00p.xyz/mypkg.html

Interesting experiment! Thanks for sharing :-) You might find this person's musings about such experiments interesting: https://www.plomlompom.de/index.en.html#topic_postprivacy

[2020-03-01] axol it

[2019-02-15] What Universal Human Experiences Are You Missing Without Realizing It? | Slate Star Codex [[mind]]

search this post on reddit or something

[2019-04-22] actually even found something interesting on gh..

https://github.com/search?q=what-universal-human-experiences-are-you-missing-without-realizing-it&type=Code
although, it's code search, not repo search

[2019-04-22] so trying to google that query

if looking for past month, that basically results in random keywords
what universal human experiences are you missing without realizing it

[2019-06-13] yeah, twitter feed is not too huge, so could subscribe to it

[2019-06-29] https://github.com/hypotext/notation - Twitter Search

https://twitter.com/search?q=https%3A%2F%2Fgithub.com%2Fhypotext%2Fnotation&partner=Firefox&source=desktop-search

[2019-08-09] axol this?

  • [2019-08-25] or aaxol for twitter? although doesn't seem to be posted often

[2020-01-09] karlicoss/cachew - Twitter Search / Twitter [[cachew]]

https://twitter.com/search?q=karlicoss%2Fcachew&partner=Firefox&source=desktop-search

[2020-08-24] All | Search powered by Algolia Noon Universe search

mypy – exclude mypython; prioritize topics [[mypy]]

sleep tracking [[sleep]] [[qs]]

add bret victor? [[bretvictor]]

[2019-06-13] uh. need a proper interface for it

  • [2019-06-13] what's the quickest possible way to create guis? still gonna be python config, right? perhaps self-checking!

    • [2019-06-15] ok, just main function sounds ok..

ted chiang – pretty nice to search on twitter [[tedchiang]]

complex numbers group; argonov; transhumanism? [[argonov]]

[2019-06-15] youtube.com/watch?v=YrXk2buqsgg

can find some interesting stuff on twitter..

[2019-07-28] "виктор аргонов" got some good results on twitter

kobo; spaced repetition? [[spacedrep]]

[2019-12-07] eh, kobo not so interesting..

[2018-08-25] scott alexander unsong - Twitter Search

https://twitter.com/search?f=tweets&vertical=default&q=scott%20alexander%20unsong&src=typd&lang=en-gb

could add this to my twitter poller thing (again, via API) or kibitzr?

karlicoss! [[self]]

[2019-06-15] doesn't look like much on pinboard…

[2019-12-07] not much interesting

cancel scott alexander search alert

set up alerts for nutrition stuff

add "lagrangian mechanics"??? [[lagrangian]]

[2020-11-30] or 'Hamiltonian'? at least on HN

[2020-03-09] #promnesia

GitHub - karlicoss/promnesia - Another piece of your extended mind

search on pinboard? or even axol..

kedr livansky [[kedr]]

exobrain? [[exobrain]]

[2020-05-01] Pinboard bookmarks tagged eeg

[2020-05-01] Pinboard bookmarks tagged km [[pkm]]

memex? esp github [[memex]]

george hotz?

add mypy to search??

[2019-10-01] tried aaxol for

"pocket export"

"data liberation"

pkm for twitter can probably be removed…

initial query… [[mypy]]

mypy -from:mypy2424 -from:mypy1031 -from:aymkmypy -to:aymkmypy -from:mypy0229

ugh, not sure how convenient it'd be to filter this shit
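
The exclusions could be generated from a list instead of being typed by hand. A small sketch: `twitter_query` is a hypothetical helper that only assembles the string, using the same `-from:` / `-to:` operators as the query above.

```python
from typing import Sequence


def twitter_query(term: str, not_from: Sequence[str] = (), not_to: Sequence[str] = ()) -> str:
    """Build a search string with -from:/-to: exclusions appended to the term."""
    parts = [term]
    parts += [f"-from:{user}" for user in not_from]
    parts += [f"-to:{user}" for user in not_to]
    return " ".join(parts)


query = twitter_query(
    "mypy",
    not_from=["mypy2424", "mypy1031", "aymkmypy", "mypy0229"],
    not_to=["aymkmypy"],
)
```

The list of noisy accounts could then live in the config next to the query itself.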

cleanup 'extended mind' – certainly lots of crap in the database [[twitter]]

hmm, beepb00p.xyz isn't resolving anything? [[self]] [[twitter]]

[2019-12-02] axol results for hackernewspkm, rendered at 02 Dec 2019 11:05

axol/summary/hackernewspkm.html

Personal Knowledge database

[2019-12-02] axol results for hackernewspkm, rendered at 02 Dec 2019 11:05

axol/summary/hackernewspkm.html

Personal knowledge base

subscribe to more news on QS, BCI and gadgets [[qs]]

  • State "DONE" from "STRT" [2019-04-22]

regular?

brain-computer interface [[bci]]

Sources

wonder if I could search among hypothesis users… [[hypothesis]]

[2019-06-15] eh, search is a bit weird…

could add google search too I suppose.. but that's def lowest priority

implement for reddit. release reddit/github searchers (as library, then import and use)

youtube? could search quantified self at least

[2019-07-20] eh, tried a few queries and it doesn't look like results appear that often…

Would be great to search in comments [[axol]] [[reddit]]

hypothesis

[2019-07-28] not that many results on pkm/quantified self..

[2019-07-28] more on spaced repetition and ted chiang

[2019-07-28] Schedule - pushshift.io

https://pushshift.io/schedule/

Current Schedule
    April comments should be available around May 20, 2018.

[2019-07-28] New API endpoint – Now you can search comments! : redditdev

https://www.reddit.com/r/redditdev/comments/3fv8vv/new_api_endpoint_now_you_can_search_comments/

New API endpoint -- Now you can search comments!

for google search, only notify about new results; not about changes. wonder how?

[2019-12-28] Search Reddit Comments by User

https://redditcommentsearch.com/

Search through comments of a particular reddit user.

[2020-01-11] pushshift/api: Pushshift API

https://github.com/pushshift/api

duckduckgo?

[2019-12-01] Pushshift Reddit Search [[reddit]] [[scrape]]

https://redditsearch.io/?term=beepb00p.xyz&dataviz=false&aggs=false&subreddits=&searchtype=posts,comments&search=true&start=0&end=1575221715&size=100

[2019-12-15] hacker-news-favorites-api/main.js at master · reactual/hacker-news-favorites-api

https://github.com/reactual/hacker-news-favorites-api/blob/master/src/main.js

const x = require('x-ray')()

hmm, it's got 'paginate'?

[2020-05-18] Hypothesis

eh need to run orger I guess? or axol!

could run HN more often [[hackernews]]

also use more generic hooks?

[2020-05-03] import my\..*$ - grep.app

https://grep.app/search?q=import%20my%5C..%2A%24&regexp=true&filter[lang][0]=Python

CI/testing

HN is very quick, so prob really good to test on (even on CI)

Sort tags by number of total occurrences?
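
Counting is cheap with `collections.Counter`. A sketch, assuming each result is a dict with an optional `tags` list (that shape is made up for the example):

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def tags_by_count(results: Iterable[Dict]) -> List[Tuple[str, int]]:
    """Count every tag across all results, most frequent first."""
    counts = Counter(tag for r in results for tag in r.get("tags", []))
    return counts.most_common()
```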

Use cachew and keep stuff as blobs with id [[cachew]]

Not sure if I should overwrite or update? Could decide later and query with unique ids to start with?
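
The overwrite vs update decision can be postponed if results are stored as raw blobs keyed by a unique id. This sketch uses plain stdlib `sqlite3` rather than cachew; the table layout and the `id` key are assumptions.

```python
import json
import sqlite3
from typing import Dict, Iterable


def store(db: sqlite3.Connection, results: Iterable[Dict], overwrite: bool = False) -> None:
    """Keep each result as a JSON blob keyed by its unique id."""
    db.execute("CREATE TABLE IF NOT EXISTS results (id TEXT PRIMARY KEY, raw TEXT)")
    # OR IGNORE keeps the first version seen; OR REPLACE overwrites it.
    verb = "REPLACE" if overwrite else "IGNORE"
    db.executemany(
        f"INSERT OR {verb} INTO results VALUES (?, ?)",
        [(r["id"], json.dumps(r)) for r in results],
    )
    db.commit()
```

Switching between the two behaviours later is then a one-flag change, and the raw data is kept either way.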

warn when there are too many atom items?

suppress some feeds in the config?

[2020-11-21] Show HN: I made an alternative to Google Alerts that listens to social media

https://www.pmalerts.com/

[2020-12-05] eh, demands to register etc

shit, seems that the timestamps are wrong and also I got the link wrong

might need to work on this: axol/databases/twitterextendedmind.sqlite

Maybe record a video on the phone ? [[demo]]

maybe check crawled pinboard users for interesting tags/links?

[2019-06-15] yeah, need to make this bit more efficient..

maybe, summary and 'rendered' are really sort of the same page? just different sorting…

Def interesting to see user stats

Maybe better way of normalising? E.g. look at tedchiang and gq article. Display 'bumped' entries separately? Like a different way of sorting

prepend # in tag?

could search for interesting tag occurrences without them actually being scraped

might be good to do some sort of fuzzy grouping?

wonder what's an efficient way of doing it? sort of similarity connected components?
/TheGoogleDotCom/status/915750443275444226
Can Google's AI-powered Clips make people care about lifelogging? - TechCrunch http://ift.tt/2wyk69G
2017-10-05 01:28 by TheGoogleDotCom
/gauravndhankar/status/915750414774972416
Can Google’s AI-powered Clips make people care about lifelogging? http://dlvr.it/PsRpwK pic.twitter.com/IAPiiqacKo
2017-10-05 01:28 by gauravndhankar
/animesh1977/status/915749491344596992
Can Google’s AI-powered Clips make people care about lifelogging? http://ift.tt/2xUwbaz
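
A naive take on the 'similarity connected components' idea: compare every pair with `difflib.SequenceMatcher` and merge similar ones with union-find. It is O(n²) comparisons, so it is only a sketch for small result sets; the 0.8 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher
from typing import Dict, List


def fuzzy_groups(texts: List[str], threshold: float = 0.8) -> List[List[int]]:
    """Group indices of texts into connected components of the 'similar' relation."""
    parent = list(range(len(texts)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            ratio = SequenceMatcher(None, texts[i].lower(), texts[j].lower()).ratio()
            if ratio >= threshold:
                parent[find(i)] = find(j)  # merge the two components

    groups: Dict[int, List[int]] = {}
    for i in range(len(texts)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())
```

Stripping shortened urls before comparing would probably help with reposts like the ones above, since the links differ even when the headline is identical.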

would be interesting to have explorer for users that looks for some relevant tags/keywords? [[pinboard]]

Hmm also need real-time search and notify I guess? [[hackernews]]

Eh, better idea would be a tag subscription… [[mypy]]

would be nice to have some efficient frontend + backend thing [[timeline]]

[2019-12-02] hmmm. actually could do it in a twitter account??

[2019-12-04] could ask on HN? [[outbox]]

[2019-12-04] or RSS? https://github.com/awesome-selfhosted/awesome-selfhosted#feed-readers

[2019-12-24] Edit Feed: beepb00p.xyz - Miniflux

https://axol.karlicoss.xyz/feed/56/edit

Scraper Rules
Rewrite Rules
Title Filter
Content Filter

[2019-12-24] Command Line Usage - Documentation

https://miniflux.app/docs/cli.html

miniflux -config-file /etc/miniflux.conf

could make a filter to release items slowly? e.g. tweets with more than 10 likes, if update pops it up, then it ends up in the feed. although I need 'processed' entries
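
A sketch of such a filter, with the 'processed' entries kept as a plain set of ids. The `likes` and `id` keys and the `release` name are assumptions; the threshold of 10 is the one from the note above.

```python
from typing import Dict, Iterable, List, Set


def release(items: Iterable[Dict], processed: Set[str], min_likes: int = 10) -> List[Dict]:
    """Return items that crossed the likes threshold and were not released before."""
    out = []
    for item in items:
        if item["likes"] > min_likes and item["id"] not in processed:
            processed.add(item["id"])  # so a later update doesn't re-release it
            out.append(item)
    return out
```

An item below the threshold is simply skipped and gets another chance on the next update.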

[2020-05-27] Axol: Personal automatic news feed – crawl Reddit/Twitter/HN and read as RSS | Hacker News

perhaps redefine everything in entities? and have relations – people, subreddits, urls, tags, etc

rename adhoc to 'search'?

think about a special tag to mark stuff that should be autoimported in a similar manner my kibitzr thing worked

some todos

  • move individual data sources to files within the repo.. not even submodules, too much hassle
    if someone needs, they can just import axol.sources.src directly
  • cleanup the json shit.. ideally use some proper library
  • not sure what to do with RSS feeds.. but could start with HTML report generation
  • query language:
    might be better to adopt
    service:sub:query
    e.g.
    pinboard:tag:whatever
    or
    github:some query
    not sure what to do with colons though.. but maybe think about this later. most won't support searching them anyway
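
One possible way to deal with the colons: split on the first colon for the service, and only treat the next segment as a `sub` if that service is known to support it. Everything else stays in the query text. `KNOWN_SUBS` and `parse_query` are hypothetical; only `pinboard:tag` comes from the notes above.

```python
from typing import NamedTuple, Optional

# Assumption: which sub-kinds each service understands.
KNOWN_SUBS = {"pinboard": {"tag"}}


class Query(NamedTuple):
    service: str
    sub: Optional[str]
    text: str


def parse_query(spec: str) -> Query:
    """Split 'service:sub:query' or 'service:query'; later colons stay in the text."""
    service, sep, rest = spec.partition(":")
    if not sep or not service or not rest:
        raise ValueError(f"expected service:query, got {spec!r}")
    sub, sep2, text = rest.partition(":")
    if sep2 and sub in KNOWN_SUBS.get(service, set()):
        return Query(service, sub, text)
    return Query(service, None, rest)
```

With this, `github:foo:bar` searches github for `foo:bar` instead of failing on an unknown sub.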

def should keep original results in the DB as far as possible

to start with, only support exact queries? e.g. demand them in queries and mention that support for fuzzier might be added later

think about multiple small databases vs one huge?

multiple small:

  • easier to mess with/explore
  • easier concurrency
  • easier to remove from reports (although for that need to make sure it's really 1-1 correspondence with source and query? dunno)

single db:

  • easier to bulk clean/somewhat easier to bulk normalise
    although this would be kind of useless if I store raw json outputs
  • easier to do queries across multiple (e.g. associating users?)
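
Cross-database queries are not necessarily lost with multiple small databases, since SQLite can attach a second file to a connection. A sketch, assuming each database has a `results` table with an `author` column (both names are made up here):

```python
import sqlite3
from typing import List


def users_in_both(db_a: str, db_b: str) -> List[str]:
    """Authors that appear in the `results` table of both database files."""
    conn = sqlite3.connect(db_a)
    try:
        conn.execute("ATTACH DATABASE ? AS other", (db_b,))
        rows = conn.execute(
            "SELECT DISTINCT a.author FROM main.results a "
            "JOIN other.results b ON a.author = b.author "
            "ORDER BY a.author"
        )
        return [row[0] for row in rows]
    finally:
        conn.close()
```

This gets clunky with more than a handful of databases, so it softens the tradeoff rather than removing it.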

thinking about query language

how it could look in adhoc mode
github:'scott alexander' twitter:'scott alexander'

in config, allow something nicer like
[twitter,github,reddit]:'scott alexander'

or [twitter,github,reddit, pinboard]:['scott alexander', 'quantified self']
pinboard:tag:scottalexander

  • NOTE: echo twitter:'scott alexander' – this is gonna get swallowed by bash… suggest to always quote?
  • NOTE: treat " and ' the same? twitter does it…
  • TODO: make sure that query parsing is defensive
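
A sketch of defensive parsing for the bracketed form. It leans on `ast.literal_eval` for the query part, so `'` and `"` are treated the same and unquoted queries are rejected instead of being guessed at. `expand` is a hypothetical name, and the `service:sub:query` form is deliberately not handled here.

```python
import ast
import re
from typing import List, Tuple

SPEC = re.compile(r"^\s*(\[[^\]]*\]|[\w-]+)\s*:\s*(.+?)\s*$", re.S)


def expand(spec: str) -> List[Tuple[str, str]]:
    """Expand "[s1,s2]:['q1','q2']" into (service, query) pairs; raise on bad input."""
    m = SPEC.match(spec)
    if m is None:
        raise ValueError(f"can't parse {spec!r}")
    services = [s.strip() for s in m.group(1).strip("[]").split(",") if s.strip()]
    try:
        # Python literal syntax: a quoted string or a list of quoted strings.
        queries = ast.literal_eval(m.group(2))
    except (ValueError, SyntaxError) as e:
        raise ValueError(f"queries must be quoted: {spec!r}") from e
    if isinstance(queries, str):
        queries = [queries]
    if not services or not isinstance(queries, list):
        raise ValueError(f"can't parse {spec!r}")
    if not all(isinstance(q, str) for q in queries):
        raise ValueError(f"queries must be strings: {spec!r}")
    return [(s, q) for s in services for q in queries]
```

On the command line the whole spec would still need outer quoting, e.g. `"twitter:'scott alexander'"`, otherwise the shell strips the inner quotes before axol sees them.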

for people to try it out it really needs a simplest service possible they can run with docker? ideally without auth etc

Track most active pinboard users? They might have interesting other stuff

[2019-07-20] maybe, try to intersect known user's tags and see what they got in common?

running under docker results in /app/axol/js/sorttable

use different font?

might need two pass algorithm? One for crawling, second for filtering?

e.g. I crawled quite a bit of pokemon crap, would be good to filter it?
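
The second pass could be as simple as a keyword filter applied when rendering, leaving the crawled data untouched. A sketch; the `title` key and the `second_pass` name are assumptions.

```python
import re
from typing import Dict, Iterable, Iterator, Sequence


def second_pass(stored: Iterable[Dict], banned: Sequence[str]) -> Iterator[Dict]:
    """Filter already crawled results; the stored data itself stays untouched."""
    if not banned:
        yield from stored
        return
    pattern = re.compile("|".join(re.escape(word) for word in banned), re.IGNORECASE)
    for result in stored:
        if not pattern.search(result.get("title", "")):
            yield result
```

Since nothing is deleted, a too-aggressive banned word can be removed later and the results come back.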

related [[pkm]] [[search]] [[degoogle]]

[2019-04-15] Pinboard: network for karlicoss [[pinboard]] [[axol]]

https://pinboard.in/network/
shit… too many tweets. I need a way to filter the network…

[2021-01-16] in fact it's the most common request to pinboard author apparently

spinboard: something's not right. e.g. try

querying t:quantified-self
https://pinboard.in/t:quantified-self

spinboard gives 220 total results. however, on the first page there are 50…
scraper is missing something?

eh. sooo, there are no dupes even!! BS4 actually sees only 20 per page (pinboard still gives us '50' in the next url).
whereas chrome does show up 50 entries; but if you go to the second page they are gonna overlap.

must be some pinboard bug?? [[pinboard]]
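
Whatever the cause, the scraper could be made robust to it by advancing by the number of items actually parsed and dropping repeats by id. A sketch: `fetch_page` is a hypothetical callable that takes an offset and returns parsed items, and whether pinboard accepts arbitrary offsets is an assumption.

```python
from typing import Callable, Dict, Iterator, List


def crawl_all(fetch_page: Callable[[int], List[Dict]]) -> Iterator[Dict]:
    """Walk pages by the number of items actually parsed, skipping repeats."""
    seen = set()
    offset = 0
    while True:
        page = fetch_page(offset)
        if not page:
            break
        for item in page:
            if item["id"] not in seen:
                seen.add(item["id"])
                yield item
        offset += len(page)  # not the '50' that the next-page url claims
```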

[2019-11-06] classes — classes 0.1.0 documentation

https://classes.readthedocs.io/en/latest/

[2020-02-15] hmm, something I was trying to do in axol?… [[axol]]

doesn't look active. all top results are from 2017 [[axol]] [[upspin]]

[2019-09-04] ScriptSmith/socialreaper: Social media scraping / data collection library for Facebook, Twitter, Reddit, YouTube, Pinterest, and Tumblr APIs [[reddit]] [[scrape]] [[axol]]

https://github.com/ScriptSmith/socialreaper

Reddit
Get the top 10 comments from the top 50 threads of all time on reddit

[2020-05-16] ok, seems to be using real APIs, so overall I'm skeptical. but it's got a nice panel for tokens [[exports]] [[jdoe]]

pruning – for now via sqlitedbbrowser? make sure it locks the db? [[axol]]
