Hello, the archives have now been released. This thread is for bug reports, suggestions, etc.
1. It's in beta; nothing on it is permanent for now (except the threads).
2. Things are still being worked on, so expect bugs. All of this was written within the last 24 hours.
3. Only /soy/ is going to be archived FOR NOW. Once we're satisfied with an initial release, we will expand it to all the boards.
4. The archive updates hourly, so threads won't appear until an hour after they're made. This will help us prevent CP from being archived.
>why are the images in their own row?
It's for multi-image support. I'll make it look better later.
>what order are they sorted in?
from creation date, newest to oldest
>will you add x feature?
Probably yes. Let me emphasize it's in beta now; it's not done yet, not even close.
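For anyone wondering how the one-hour delay and the newest-to-oldest ordering described above could fit together, here is a minimal sketch. It is not the archive's actual code. It assumes each thread is a dict with a `time` field holding its creation time in Unix seconds, and `visible_threads` is a hypothetical name. The real archive runs as an hourly batch, so its cutoff may not behave exactly like this rolling one.

```python
import time

ONE_HOUR = 60 * 60  # the delay mentioned in point 4, in seconds


def visible_threads(threads, now=None, delay=ONE_HOUR):
    """Keep threads that are at least `delay` seconds old, newest first.

    `threads` is assumed to be a list of dicts with a `time` key
    (creation time as a Unix timestamp). This is an illustration only.
    """
    if now is None:
        now = time.time()
    # Drop anything younger than the delay so it can be moderated first.
    old_enough = [t for t in threads if now - t["time"] >= delay]
    # Sort by creation date, newest to oldest.
    return sorted(old_enough, key=lambda t: t["time"], reverse=True)
```

Passing `now` explicitly makes the function easy to check against fixed timestamps.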
because it's full of dust
/soy/ is dusty too
ARCHIVE >>>/nate/ NOW!
Gemmier than /soy/
If the archive grabs everything posted as it's posted then what's the point of having the delete post feature on?
the only reason it's on is because people forced kuz
Thank you very much for this suggestion. We are currently scraping the intact API of that site, and it's restored hundreds of threads from that era. This has greatly expanded our archive's usefulness, so thank you.
https://sp.logwarehouse.net/read.cgi/suggest/6784
https://sp.logwarehouse.net/read.cgi/suggest/8908
some old threads we've been able to archive because of this.
^ tranny moment
Here's another backup from another time
http://126.96.36.199/
Also is there any chance threads from the wayback machine can be added?
>>40210
>Good job.
>Here's another backup from another time
>http://188.8.131.52/
Thanks, I'll add it too. These are greatly appreciated, as each one adds upwards of 800 old threads to our archive.
>Also is there any chance threads from the wayback machine can be added?
No, the only reason these worked is because the API is completely intact and independent from the DB. However, Cloudflare messes with archived files, and archived APIs probably don't even exist, so it would require significant extensions. If some breakthrough appears that does allow this, we will post that announcement here.
FoolFuuka is shit according to the desuarchive devs
yeah it's straight up doodoo. this kuz shit is probably better for our needs
foolfuuka is over engineered, kuz software is (generally) extremely simple and effective
FoolFuuka was being replaced by wakarimasen devs, but unfortunately wakarimasen died. But it's shitty software, writing from scratch is a better option.
NOW ARCHIVE /nate/
Seeing that this is a first party archive and there is access to the backend, does Archiva also grab private data that Vichan stores, such as IP addresses?
he will never reveal that information
I wish asking for /nate/ to be archived gave me 50 cents
He should probably just partially open-source it, mainly when it comes to the back-end (i.e. the actual scraper, but modified to not specifically target the sharty at first), and a very simplistic version of the current front-end. That way, it’d just be a generic vichan archiver, and not really a sharty one.
Would be great for archiving other altchans
you faggot cocksucker nigger monkey search doesn't work and so many threads I was looking for don't even exist
I went to each page searching for a thread that had "saving" in it, and I found nothing
fix it you log eater
he said it uses the vichan api to scrape for threads. you can write one in maybe 50 lines of python
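To give an idea of what such a scraper could look like, here is a rough sketch using only the standard library. It is not the archive's actual code. It assumes the board serves vichan's 4chan-style JSON files (`/<board>/catalog.json` and `/<board>/res/<no>.json`). The function names, the User-Agent string and the one-second delay are all made up for illustration.

```python
import json
import time
import urllib.request


def fetch_json(url, timeout=30):
    """Download a URL and parse the response body as JSON."""
    req = urllib.request.Request(url, headers={"User-Agent": "archive-sketch/0.1"})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def thread_numbers(catalog):
    """Collect thread numbers from a parsed catalog.json (a list of pages)."""
    return [t["no"] for page in catalog for t in page.get("threads", [])]


def scrape_board(base_url, board, fetch=fetch_json, delay=1.0):
    """Return {thread_no: [posts]} for every thread currently in the catalog.

    `fetch` is injectable so the logic can be exercised without a network.
    """
    catalog = fetch(f"{base_url}/{board}/catalog.json")
    archive = {}
    for no in thread_numbers(catalog):
        try:
            thread = fetch(f"{base_url}/{board}/res/{no}.json")
        except OSError:
            # Covers urllib's URLError/HTTPError, e.g. a thread deleted mid-run.
            continue
        archive[no] = thread.get("posts", [])
        time.sleep(delay)  # be polite to the server
    return archive
```

Calling `scrape_board("https://example.invalid", "soy")` (placeholder URL) would return the posts of every live thread. Saving images and writing the result to disk are left out to keep it short.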
what purpose would this serve
There's a problem with spoilered files where sometimes they don't get archived, there's also a problem with the >>>/nate/
Will the archive crawl Yandex's caches?
Yandex has a HUGE backlog of threads (on the .ru domain) from the soot era that are not saved anywhere else.
i want to access the archive without any images
>>39736
come on janny, plz do it
not him but you should consider archiving >>>/nate/
it's filled with dust that no one wants.
ok im finna be straight wif you fam, i want to scrape the whole thing for le heckin datahoarding
everyone wants to see it archived though
no one wants that, except you.
i do you retarded tranny
says the who wants a board made specifically made de get rid of tranny nas garbage
add a json api
make it load faster
Retarded. If open-sourcing wasn't competitive every big tech company wouldn't do it