ACM Makes Thousands of Research Articles Freely Available; Opens First 50 Years (acm.org)
268 points by kulandai on June 12, 2022 | 37 comments


If anyone is curious, here’s the previous discussion:

https://news.ycombinator.com/item?id=31447465

“ACM Digital Library Archive is Open Access with 50 Years of Published Records” (acm.org)

473 points by yarapavan 23 days ago | 51 comments


And here's the original discussion at the time of the announcement:

https://news.ycombinator.com/item?id=30944881

    ACM Opens First 50 Years Backfile
    177 points by mitchbob 66 days ago | 49 comments


I interviewed ACM's CEO recently on this topic, which has been shared previously: https://associationsnow.com/2022/05/the-way-things-were-why-...

Quick note: They are planning on opening the whole archive in a few years but it is still a work in progress. The plan is for 2025 for the whole archive to be opened. They released the first 50 years now in part because it timed well to the 75th anniversary.

Finally, the org's 75th anniversary livestream took place on Friday; more info about it is here: https://www.acm.org/75-celebration-event


Thanks for the article. I've wondered why they need to open access to these papers in tranches. I imagine there is probably work to prepare each paper, and possibly agreements to secure?

> “Our goal is to have it open in a few years, but there’s very real costs associated with [the open-access work],” Hanson said. “We have models so that we can pay for it.”

This sort of confuses me even more. Is the "paying for it" talking about the work needed to be done to make each paper open-access, or is this talking about funding the ability to host the papers?

Just curious about what the costs mentioned are, what the plan is to cover them, and how that results in needing to open access to the papers in tranches. Maybe the model could be adopted by other journals, too?


While I don't want to speculate for ACM, I do know a lot about the association space and their business models as I have reported on this space for a decade. I would suggest that in a situation like this, "paying for it" probably refers to:

- Accounting for the loss in revenue from changing a service that was once commercial (as this would be an example of a loss of a "nondues revenue," something a lot of organizations are actually trying to increase so as to cut back their reliance on membership dues)

- Potential rights considerations (there are magazine articles with photos in that archive, for example, and many of those might have contracts/ownership considerations to work through)

- The labor of actually managing the transition (with something like this you wouldn't want to just run a script to open it up, you probably would have to double-check things, and there are simply a lot of things). They have to budget for any employee work used to handle this transition.

They emphasized to me in our interview, however, that the issue was not technical; I actually thought it might be given the amount of content there is, but as they're quoted as saying in the piece, the real work centered around the creative decision-making around actually making a call like this.


This is great news. I have been rather annoyed in the past when I wanted to publicly share stuff that was in the archive. The last such example I can remember was Peter Naur's reasoning that it should not be computer science, but Datalogy: https://dl.acm.org/doi/10.1145/365719.366510


In some languages it is datalogy :). For example Swedish: https://sv.wikipedia.org/wiki/Datavetenskap


Peter Naur was Danish, and was successful in getting the term datalogy established in the Scandinavian languages. In recent years, though, most institutions have stopped using it in English materials and adopted "computer science", I guess for greater international visibility. For example, the department Naur established at the University of Copenhagen, the Datalogisk Institut [1], is now branded as the Department of Computer Science [2] in English, whereas when Naur was still running it, he called it the Datalogy Institute.

[1] https://di.ku.dk/

[2] https://di.ku.dk/english/


A lot of the digital libraries, open data sets & code repositories suffer from the same (quite hard) problem: a lack of finding aids for humans. They just greet the user with the same boolean advanced search fields. ACM, IEEE, NASA-ADS, Arxiv, Genbank, etc. are all aware of open access and "researcher development", but no one has a better solution than the cloud based capsule model ;)


Anyone else's company's employment agreement prevent them from joining a standards body without permission?


ACM is a professional organization, not a standards body.


Nope. I've even had them pay my membership costs for ACM and IEEE and various SIGs or Societies within them. Why wouldn't they want you to participate, even passively, in these organizations?


Because they do not want you to submit their proprietary tech as an open standard. In the extreme, think about submitting .NET to ECMA. Something lighter might be a proprietary image format or interchange method. Engineers may find this useful. It may be at the request of a significant customer. It may even be in the best interest of the company. It does not mean senior management will agree though.


MAANG+M. They want control.


I meant ANSI, ITU, ISO standards committee involvement or publishing RFCs on your own time without their name attached to yours. I can understand not using their name without permission.


Where are you from? Is it common for an employer to be able to regulate what you do when you're not on the clock?


RIP Oxide computer company. Cantrill's productivity is about to go to zero for a while.


Ha -- I've actually been a Digital Library subscriber for a long time due to my association with ACM Queue, and tried in vain for years to get the ACM to do something like this. I did convince them years ago to let selected practitioner DL subscribers "unlock" interesting articles, which I promptly did myself with one of my all-time favorite articles[0] -- but ACM didn't publicize the program (or really even mention it?) and to the best of my knowledge, that article was the only one made available.

[0] http://dtrace.org/blogs/bmc/2008/07/18/revisiting-the-intel-...


> RIP Oxide computer company. Cantrill's productivity is about to go to zero for a while.

What’s the relation?


Well, it seems not seeing this response in a timely manner has worked out for me, as Bryan himself has already responded, but the joke was that he's often talked about being sidelined by journal study and/or obsessing over old manuals and papers. He also gave a paperswelove[0] talk on two papers in 2016, one of which is actually in the ACM library[1], and was involved with systemswelove[2].

Bryan, if you're reading this, I'm not a stalker. Apparently being accused of making a "submarine post" makes for a thorough reply, from me at least, even never having heard the term before.

  [0] https://paperswelove.org/2016/video/bryan-cantrill-jails-and-solaris-zones/
  [1] https://dl.acm.org/doi/10.5555/1052676.1052707
  [2] https://www.youtube.com/watch?v=QjVkCfgsLQ4

Incidentally, if anyone knows how to get HN to put my citations on separate lines without indenting, do let me know. I tried a few different markdownisms but this was the best I could do.


A submarine post for an unrelated company


Sadly, I think this announcement came around the time the relationship that gave you access to the O'Reilly library was torn up. I suspect a lot of people won't be renewing their ACM membership just on economic grounds.


Awesome! Thanks ACM!


On a related subject, does anyone know of a simple technique to download and save large quantities of all publicly available research articles?


Simple technique and large quantities are often not compatible... Check out the work of Internet Archive Scholar at https://scholar.archive.org. Their code is open source https://github.com/internetarchive/fatcat-scholar


Write a scraping program with appropriate throttling to be nice to providers with beautifulsoup on python. I have done this fairly successfully.
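
For anyone wanting a starting point, here is a minimal sketch of that kind of throttled scraper. To keep it runnable without extra installs it swaps in the standard library's html.parser for beautifulsoup, and everything named here (`extract_pdf_links`, `crawl`, the User-Agent string, the 5 second delay) is a hypothetical choice for illustration, not anything a particular provider asks for. Check each site's terms and robots.txt before pointing it at anything.

```python
import time
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import Request, urlopen


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a href="..."> links that end in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and href.lower().endswith(".pdf"):
            # Resolve relative links against the page they were found on.
            self.links.append(urljoin(self.base_url, href))


def extract_pdf_links(html, base_url):
    """Return every PDF link in the page, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def http_get(url):
    """Fetch a URL and return its body as bytes (hypothetical User-Agent)."""
    request = Request(url, headers={"User-Agent": "polite-paper-fetcher/0.1"})
    with urlopen(request, timeout=30) as response:
        return response.read()


def crawl(index_urls, delay_seconds=5.0, fetch=http_get, sleep=time.sleep):
    """Yield (pdf_url, pdf_bytes) pairs, pausing after every request.

    fetch and sleep are parameters so the loop can be exercised offline.
    """
    seen = set()
    for index_url in index_urls:
        html = fetch(index_url).decode("utf-8", errors="replace")
        sleep(delay_seconds)
        for pdf_url in extract_pdf_links(html, index_url):
            if pdf_url in seen:
                continue  # never download the same paper twice
            seen.add(pdf_url)
            yield pdf_url, fetch(pdf_url)
            sleep(delay_seconds)
```

Usage would be something like `for url, data in crawl(["https://example.org/list/"]): ...` and writing `data` to disk; the fixed delay is the simplest form of throttling, and backing off on HTTP 429 or 503 responses would be a sensible next step.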


Since it is available for free online, why not just download what you need? Their site has nice suggestions for related works anyway -- managing a big heap o' papers seems like a pain.


I look forward to someone creating a curated list of particularly influential papers over the years.


What valuable information is on ACM.org that is not possible to get anywhere else? I mean, in this age where everything is so public. Or is it not? I used to have one year of free access to all of ACM's resources. It was a gift. But I did not do anything with it. Maybe I missed an opportunity.


(edit: almost) everything can be found on sci hub.


Not everything. A lot. But stuff published this year still hasn't made its way there. And not all social sciences.

source: dissertating

(When I need something I can't get through my library, #icanhazpdf, scihub, or other googley means, I just email the author. Though that can take some time.)


In my experience, the only hole in shadow libraries is that they lack scans of old papers on non-trendy subjects. Interestingly, it seems the humanities don't have analogously useful sites.


What if you needed to link to a paper on your company's website? You'd link to sci-hub?


I think you are overestimating how available high-quality information is. Low quality, misinformation, disinformation - that is all widely available.


Thank you! That is all.


scihub?


[flagged]


[flagged]


Could you not? Please?

Take a gander at the site guidelines linked in the footer.



