Wikipedia:Village pump (proposals)
The proposals section of the village pump is used to offer specific changes for discussion. Before submitting:
- Check to see whether your proposal is already described at Perennial proposals. You may also wish to search the FAQ.
- This page is for concrete, actionable proposals. Consider developing earlier-stage proposals at Village pump (idea lab).
- Proposed policy changes belong at Village pump (policy).
- Proposed speedy deletion criteria belong at Wikipedia talk:Criteria for speedy deletion.
- Proposed WikiProjects or task forces may be submitted at Wikipedia:WikiProject Council/Proposals.
- Proposed new wikis belong at meta:Proposals for new projects.
- Proposed new articles belong at Wikipedia:Requested articles.
- Discussions or proposals which warrant the attention or involvement of the Wikimedia Foundation belong at Wikipedia:Village pump (WMF).
- Software changes which have consensus should be filed at Phabricator.
Discussions are automatically archived after remaining inactive for nine days.
RfC: Log the use of the HistMerge tool at both the merge target and merge source
Currently, there are open phab tickets proposing that the use of the HistMerge tool be logged at the target article in addition to the source article. Several proposals have been made:
- Option 1a: When using Special:MergeHistory, a null edit should be placed in the page histories of both the merge target and the merge source stating that a history merge took place.
- (phab:T341760: Special:MergeHistory should place a null edit in the page's history describing the merge, authored Jul 13 2023)
- Option 1b: When using Special:MergeHistory, a log entry should be added for the articles at both the HistMerge target and source, recording the existence of a history merge.
- (phab:T118132: Merging pages should add a log entry to the destination page, authored Nov 8 2015)
- Option 2: Do not log the use of the Special:MergeHistory tool at the merge target, maintaining the current status quo.
Should the use of the HistMerge tool be explicitly logged? If so, should the use be logged via an entry in the page history or should it instead be held in a dedicated log? — Red-tailed hawk (nest) 15:51, 20 November 2024 (UTC)
Survey: Log the use of the HistMerge tool
- Option 1a/b. I am in principle in support of adding this logging functionality, since people don't typically have access to the source article title (where the histmerge is currently logged) when viewing an article in the wild. There have been several times I can think of when I've been going diff hunting or browsing page history and where some explicit note of a histmerge having occurred would have been useful. As for whether this is logged directly in the page history (as is done currently with page protection) or if this is merely in a separate log file, I don't have particularly strong feelings, but I do think that adding functionality to log histmerges at the target article would improve clarity in page histories. — Red-tailed hawk (nest) 15:51, 20 November 2024 (UTC)
- Option 1a/b. No strong feelings on which way is best (I'll let the experienced histmergers comment on this), but logging a history merge definitely seems like a useful feature. Chaotic Enby (talk · contribs) 16:02, 20 November 2024 (UTC)
- Option 1a/b. Chaotic Enby has said exactly what I would have said (but more concisely) had they not said it first. Thryduulf (talk) 16:23, 20 November 2024 (UTC)
- 1b would be most important to me but 1a would be nice too. But this is really not the place for this sort of discussion, as noted below. Graham87 (talk) 16:28, 20 November 2024 (UTC)
- Option 2 History merging done right should be seamless, leaving the page indistinguishable from if the copy-paste move being repaired had never happened. Adding extra annotations everywhere runs counter to that goal. Prefer 1b to 1a if we have to do one of them, as the extra null edits could easily interfere with the history merge being done in more complicated situations. * Pppery * it has begun... 16:49, 20 November 2024 (UTC)
- Could you expound on why they should be indistinguishable? I don't see how this could harm any utility. A log action at the target page would not show up in the history anyways, and a null edit would have no effect on comparing revisions. Aaron Liu (talk) 17:29, 20 November 2024 (UTC)
- Why shouldn't it be indistinguishable? Why is it necessary to go out of our way to say even louder that someone did something wrong and it had to be cleaned up? * Pppery * it has begun... 17:45, 20 November 2024 (UTC)
- All cleanup actions are logged to all the pages they affect. Aaron Liu (talk) 18:32, 20 November 2024 (UTC)
- 2 History merges are already logged, so this survey name is somewhat off the mark. As someone who does this work: I do not think these should be displayed at either location. It would cause a lot of noise in history pages that people probably would not fundamentally understand (2 revisions for "please process this" and "remove tag" and a 3rd revision for the suggested log), and it would be "out of order" in that you will have merged a bunch of revisions but none of those revisions would be nearby the entry in the history page itself. I also find protections noisy in this way as well, and when moves end up causing a need for history merging, you end up with doubled move entries in the merged history, which also is confusing. Adding history merges to that case? No thanks. History merges are more like deletions and undeletions, which already do not add displayed content to the history view. Izno (talk) 16:54, 20 November 2024 (UTC)
- They presently are logged, but only at the source article. Take for example this entry. When I search for the merge target, I get nothing. It's only when I search the merge source that I'm able to get a result, but there isn't a way to know the merge source.
- If I don't know when or if the histmerge took place, and I don't know what article the history was merged from, I'd have to look through the entirety of the merge log manually to figure that out—and that's suboptimal. — Red-tailed hawk (nest) 17:05, 20 November 2024 (UTC)
- ... Page moves do the same thing, only log the move source. Yet this is not seen as an issue? :)
- But ignoring that, why is it valuable to know this information? What do you gain? And is what you gain actually valuable to your end objective? For example, let's take your
There have been several times I can think of when I've been going diff hunting or browsing page history and where some explicit note of a histmerge having occurred would have been useful.
Are the revisions left behind in the page history by both the person requesting and the person performing the histmerge not enough (see {{histmerge}})? There are history merges done that don't have that request format such as the WikiProject history merge format, but those are almost always ancient revisions, so what are you gaining there? And where they are not ancient revisions, they are trivial kinds of the form "draft x -> page y, I hate that I even had to interact with this history merge it was so trivial (but also these are great because I don't have to spend significant time on them)". Izno (talk) 17:32, 20 November 2024 (UTC)
I don't think everyone would necessarily agree (see Toadspike's comment below). Chaotic Enby (talk · contribs) 17:42, 20 November 2024 (UTC)
- Page moves do leave a null edit on the page that describes where the page was moved from and was moved to. And it's easy to work backwards from there to figure out the page move history. The same cannot be said of the Special:MergeHistory tool, which doesn't make it easy to re-construct what the heck went on unless we start diving naïvely through the logs. — Red-tailed hawk (nest) 17:50, 20 November 2024 (UTC)
- It can be *possible* to find the original history merge source page without looking through the merge log, but the method for doing so is very brittle and extremely hacky. Basically, look for redirects to the page using "What links here", and find the redirect whose first edit has an unusual byte difference. This relies on the redirect being stable and not deleted or retargeted. There is also another way that relies on byte difference bugs as described in the above-linked discussion by wbm1058. Both of those are ... particularly awful. Graham87 (talk) 03:48, 21 November 2024 (UTC)
- In the given example, the history-merge occurred here. Your "log" is the edit summaries. "Created page with '..." is the edit summary left by a normal page creation. But wait, there is page history before the edit that created the page. How did it get there? Hmm, the previous edit summary "Declining submission: v - Submission is improperly sourced (AFCH)" tips you off to look for the same title in draft: namespace. Voila! Anyone looking for help with understanding a particular merge may ask me and I'll probably be able to figure it out for you. – wbm1058 (talk) 05:51, 21 November 2024 (UTC)
- Here's another example, of a merge within mainspace. The automatic edit summary (created by the MediaWiki software) of this (No difference) diff "Removed redirect to Jordan B. Acker" points you to the page that was merged at that point. Voila. Voila. Voila. – wbm1058 (talk) 13:44, 21 November 2024 (UTC)
- There are times where those traces aren't left. Aaron Liu (talk) 13:51, 21 November 2024 (UTC)
- Here's another scenario, this one from WP:WikiProject History Merge. The page history shows an edit adding +5,800 bytes, leaving the page with 5,800 bytes. But the previous edit did not leave a blank page. Some say this is a bug, but it's also a feature. That "bug" is actually your "log" reporting that a hist-merge occurred at that edit. Voila, the log for that page shows a temp delete & undelete setting the page up for a merge. The first item on the log:
- @ 20:14, 16 January 2021 Tbhotch moved page Flag of Yucatán to Flag of the Republic of Yucatán (Correct name)
- clues you in to where to look for the source of the merge. Voila, that single edit which removed −5,633 bytes tells you that previous history was merged off of that page. The log provides the details. – wbm1058 (talk) 16:03, 21 November 2024 (UTC)
- (phab:T76557: Special:MergeHistory causes incorrect byte change values in history, authored Dec 2 2014) — Preceding unsigned comment added by Wbm1058 (talk • contribs) 18:13, 21 November 2024 (UTC)
- Again, there are times where the clues are much harder to find, and even in those cases, it'd be much better to have a unified and assured way of finding the source. Aaron Liu (talk) 16:11, 21 November 2024 (UTC)
- Indeed. This is a prime example of an unintended undocumented feature. Graham87 (talk) 08:50, 22 November 2024 (UTC)
- Yeah. I don't think that we can permanently rely on that, given that future versions of MediaWiki are not bound in any real way to support that workaround. — Red-tailed hawk (nest) 04:24, 3 December 2024 (UTC)
- Support 1b (log only), oppose 1a (null edit). I defer to the experienced histmergers on this, and if they say that adding null edits everywhere would be inconvenient, I believe them. However, I haven't seen any arguments against logging the histmerge at both articles, so I'll support it as a sensible idea. (On a similar note, it bothers me that page moves are only logged at one title, not both.) Toadspike [Talk] 17:10, 20 November 2024 (UTC)
- Option 2. The merges are already logged, so there’s no reason to add it to page histories. While it may be useful for habitual editors, it will just confuse readers who are looking for an old revision and occasional editors. Ships & Space(Edits) 18:33, 20 November 2024 (UTC)
- But only the source page is logged as the "target". IIRC it currently can be a bit hard to find out when and who merged history into a page if you don't know the source page and the mergeperson didn't leave any editing indication that they merged something. Aaron Liu (talk) 18:40, 20 November 2024 (UTC)
- 1B. The present situation of the action being only logged at one page is confusing and unhelpful. But so would be injecting null-edits all over the place. — SMcCandlish ☏ ¢ 😼 01:38, 21 November 2024 (UTC)
- Option 2. This exercise is dependent on finding a volunteer MediaWiki developer willing to work on this. Good luck with that. Maybe you'll find one a decade from now. – wbm1058 (talk) 05:51, 21 November 2024 (UTC)
- And, more importantly, someone in the MediaWiki group to review it. I suspect there are many people, possibly including myself, who would code this if they didn't think they were wasting their time shuffling things from one queue to another. * Pppery * it has begun... 06:03, 21 November 2024 (UTC)
- That link requires a Gerrit login/developer account to view. It was a struggle to get in to mine (I only have one because of an old Toolforge account and I'd basically forgotten about it), but for those who don't want to go through all that, that group has only 82 members (several of whose usernames I recognise) and I imagine they have a lot on their collective plate. There's more information about these groups at Gerrit/Privilege policy on MediaWiki. Graham87 (talk) 15:38, 21 November 2024 (UTC)
- Sorry, I totally forgot Gerrit behaved in that counterintuitive way and hid public information from logged out users for no reason. The things you miss if Gerrit interactions become something you do pretty much every day. If you want to count the members of the group you also have to follow the chain of included groups - it also includes https://ldap.toolforge.org/group/wmf, https://ldap.toolforge.org/group/ops and the WMDE-MediaWiki group (another login-only link), as well as a few other permission edge cases (almost all of which are redundant because the user is already in the MediaWiki group) * Pppery * it has begun... 18:07, 21 November 2024 (UTC)
- Support 1a/b, and I would encourage the closer to disregard any opposition based solely on the chances of someone ever actually implementing it. —Compassionate727 (T·C) 12:52, 21 November 2024 (UTC)
- Fine. This stupid RfC isn't even asking the right questions. Why did I need to delete (an expensive operation) and then restore a page in order to "set up for a history merge"? Should we fix the software so that it doesn't require me to do that? Why did the page-mover resort to cut-paste because there was page history blocking their move, rather than ask an administrator for help? Why doesn't the software just let them move over that junk page history themselves, which would negate the need for a later hist-merge? (Actually in this case the offending user only has made 46 edits, so they don't have page-mover privileges. But they were able to move a page. They just couldn't move it back a day later after they changed their mind.) wbm1058 (talk) 13:44, 21 November 2024 (UTC)
- Yeah, revision move would be amazing, for a start. Graham87 (talk) 15:38, 21 November 2024 (UTC)
- Option 1b – changes to a page's history should be listed in that page's log. There's no need to make a null edit; pagemove null edits are useful because they meaningfully fit into the page's revision history, which isn't the case here. jlwoodwa (talk) 00:55, 22 November 2024 (UTC)
- Option 1b sounds best since that's what those in the know seem to agree on, but 1a would probably be OK. Abzeronow (talk) 03:44, 23 November 2024 (UTC)
- Option 1b seems like the one with the best transparency to me. Thanks. Huggums537voted! (sign🖋️|📞talk) 06:59, 25 November 2024 (UTC)
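For readers who want to try the lookup discussed in this survey, here is a minimal sketch of building a MediaWiki Action API query for merge-log entries filed under a given title. It assumes the standard `list=logevents` query module with `letype=merge`; the helper name `merge_log_url` and the example title are hypothetical, and the sketch only builds the URL rather than calling the API. As described above, such a lookup currently finds entries only when given the merge source's title, not the target's.

```python
from urllib.parse import urlencode

# English Wikipedia's Action API endpoint.
API = "https://en.wikipedia.org/w/api.php"


def merge_log_url(title, limit=10):
    """Build a query URL for merge-log entries recorded under `title`.

    `merge_log_url` is a hypothetical helper for illustration only.
    """
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "merge",   # restrict results to the merge log
        "letitle": title,    # the page the log entry is filed under
        "lelimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


# Hypothetical title; substitute the suspected merge source.
print(merge_log_url("Draft:Example"))
```

Opening the printed URL in a browser returns the matching log entries as JSON, if any exist for that title.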
Discussion: Log the use of the HistMerge tool
- I'm noticing some commentary in the above RfC (on widening importer rights) as to whether or not this might be useful going forward. I do think that having the community weigh in one way or another here would be helpful in terms of deciding whether or not this functionality is worth building. — Red-tailed hawk (nest) 15:51, 20 November 2024 (UTC)
- This is a missing feature, not a config change. Aaron Liu (talk) 15:58, 20 November 2024 (UTC)
- Indeed; it's about a feature proposal. — Red-tailed hawk (nest) 16:02, 20 November 2024 (UTC)
- As many of the above, this is a feature request and not something that should be special for the English Wikipedia. — xaosflux Talk 16:03, 20 November 2024 (UTC)
- See phab:T341760. I'm not seeing any sort of reason this would need per-project opt-ins requiring a local discussion. — xaosflux Talk 16:05, 20 November 2024 (UTC)
- True, but I agree with Red-tailed hawk that it's good to have the English Wikipedia community weigh on whether we want that feature implemented here to begin with. Chaotic Enby (talk · contribs) 16:05, 20 November 2024 (UTC)
- Here is the Phabricator project page for MergeHistory, and the project's 11 open tasks. – wbm1058 (talk) 18:13, 21 November 2024 (UTC)
- I agree that this is an odd thing to RFC. This is about a feature in MediaWiki core, and there are a lot more users of MediaWiki core than just English Wikipedia. However, please do post the results of this RFC to both of the phab tickets. It will be a useful data point with regards to what editors would find useful. –Novem Linguae (talk) 23:16, 21 November 2024 (UTC)
CheckUser for all new users
All new users (IPs and accounts) should be subject to CheckUser against known socks. This would prevent recidivist socks from returning and save the time and energy of users who have to prove a likely case at SPI. Recidivist socks often get better at covering their "tells" each time making detection increasingly difficult. Users should not have to make the huge effort of establishing an SPI when editing from an IP or creating a new account is so easy. We should not have to endure Wikipedia:Long-term abuse/HarveyCarter, Wikipedia:Sockpuppet investigations/Phạm Văn Rạng/Archive or Wikipedia:Sockpuppet investigations/Orchomen/Archive if CheckUser can prevent them. Mztourist (talk) 04:06, 22 November 2024 (UTC)
- I'm pretty sure that even if we had enough checkuser capacity to routinely run checks on every new user that doing so would be contrary to global policy. Thryduulf (talk) 04:14, 22 November 2024 (UTC)
- Setting aside privacy issues, the fact that the WMF wouldn't let us do it, and a few other things: Checking a single account, without any idea of who you're comparing them to, is not very effective, and the worst LTAs are the ones it would be least effective against. This has been floated several times in the much narrower context of adminship candidates, and rejected each time. It probably belongs on WP:PEREN by now. -- Tamzin[cetacean needed] (they|xe) 04:21, 22 November 2024 (UTC)
- Why can't it be automated? What are the privacy issues and what would WMF concerns be? There has to be a better system than SPI which imposes a huge burden on the filer (and often fails to catch socks) while we just leave the door open for LTAs. Mztourist (talk) 04:39, 22 November 2024 (UTC)
- How would it be automated? We can't just block everyone who even sometimes shares an IP with someone, which is most editors once you factor in mobile editing and institutional WiFi. Even if we had a system that told checkusers about all shared-IP situations and asked them to investigate, what are they investigating for? The vast majority of IP overlaps will be entirely innocent, often people who don't even know each other. There's no way for a checkuser to find any signal in all that noise. So the only way a system like this would work is if checkusers manually identified IP ranges that are being used by LTAs, and then placed blocks on those ranges to restrict them from account creation... Which is what already happens. -- Tamzin[cetacean needed] (they|xe) 04:58, 22 November 2024 (UTC)
- I would assume that IT experts can work out a way to automate CheckUser. If someone edits on a shared IP used by a previous sock that should be flagged and human CheckUsers notified so they can look at the edits and the previous sock edits and warn or block as necessary. Mztourist (talk) 05:46, 22 November 2024 (UTC)
- We already have autoblock. For cases it doesn't catch, there's an additional manual layer of blocking, where if a sock is caught on an IP that's been used before but wasn't caught by autoblock, a checkuser will block the IP if it's technically feasible, sometimes for months or years at a time. Beyond that, I don't think you can imagine just how often "someone edits on a shared IP used by a previous sock". I'm doing that right now, probably, because I'm editing through T-Mobile. Basically anyone who's ever edited in India or Nigeria has been on an IP used by a previous sock. Basically anyone who's used a large institution's WiFi. There is not any way to weed through all that noise with automation. -- Tamzin[cetacean needed] (they|xe) 05:54, 22 November 2024 (UTC)
- Addendum: An actually potentially workable innovation would be something like a system that notifies CUs if an IP is autoblocked more than once in a certain time period. That would be a software proposal for Phabricator, though, not an enwiki policy proposal, and would still have privacy implications that would need to be squared with the WMF. -- Tamzin[cetacean needed] (they|xe) 05:57, 22 November 2024 (UTC)
- I believe Tamzin has it about right, but I want to clarify a thing. If you're hypothetically using T-Mobile (and this also applies to many other ISPs and many LTAs) then the odds are very high that you're using an IP address which has never been used before. With T-Mobile, which is not unusually large by any means, you belong to at least one /32 range which contains a number of IP addresses so big that it has 30 digits. These ranges contain a huge number of users. At the other extreme you have some countries with only a handful of IPs, which everyone uses. These IPs also typically contain a huge number of users. TL;DR: if someone is using a single IP on their own then we'll probably just block it, otherwise you're talking about matching a huge number of users. -- zzuuzz (talk) 03:20, 23 November 2024 (UTC)
- As I understand it, if you're hypothetically using T-Mobile, then you're not editing, because someone range-blocked the whole network in pursuit of a vandal(s). See Wikipedia:Advice to T-Mobile IPv6 users. WhatamIdoing (talk) 03:36, 23 November 2024 (UTC)
- T-Mobile USA is a perennial favourite of many of the most despicable LTAs, but that's beside the point. New users with an account can actually edit from T-Mobile. They can also edit from Jio, or Deutsche Telekom, Vodafone, or many other huge networks. -- zzuuzz (talk) 03:50, 23 November 2024 (UTC)
- Would violate the policy WP:NOTFISHING. –Novem Linguae (talk) 04:43, 22 November 2024 (UTC)
- It would apply to every new User as a protective measure against sockpuppetry, like a credit check before you get a card/overdraft. WP:NOTFISHING is archaic like the whole burdensome SPI system that forces honest users to do all the hard work of proving sockpuppetry while socks and vandals just keep being welcomed in under WP:AGF. Mztourist (talk) 05:46, 22 November 2024 (UTC)
- What you're suggesting is to just inundate checkusers with thousands of cases. The suggestion (as I understand it) removes burden from SPI filers by adding a disproportional burden on checkusers, who are already an overworked group. If you're suggesting an automated solution, then I believe IP blocks/IP range blocks and autoblock (discussed by Tamzin, above) already cover enough. It's quite hard to weigh up what you're really suggesting because it feels very vague without much detail - it sounds like you're just saying "a new SPI should be opened for every new user and IP, forever" which is not really a workable solution (for instance, 50 accounts were made in the last 15 minutes, which is about one every 18 seconds) BugGhost🦗👻 18:12, 22 November 2024 (UTC)
- And most of those accounts will make zero, one, or two edits, and then never be used again. Even if we liked this idea, doing it for every single account creation would be a waste of resources. WhatamIdoing (talk) 23:43, 22 November 2024 (UTC)
- No, they should not. voorts (talk/contributions) 17:23, 22 November 2024 (UTC)
- This, very bluntly, flies in the face of WMF policy with regards to use/protection of PII, and as noted by Tamzin this would result in frankly obscene amounts of collateral damage. You have absolutely no idea how frequently IP addresses get passed around (especially in the developing world or on T Mobile), such that it could feasibly have three different, unrelated, people on it over the course of a day or so. —Jéské Couriano v^_^v threads critiques 18:59, 22 November 2024 (UTC)
- Just out of curiosity: If a certain case of IPs spamming at Help Desk is any indication, would a CU be able to stop that in its tracks? 2601AC47 (talk|contribs) Isn't a IP anon 14:29, 23 November 2024 (UTC)
- CUs use their tools to identify socks when technical proof is necessary. The problem you're linking to is caused by one particular LTA account who is extremely obvious and doesn't really require technical proof to identify - check users would just be able to provide evidence for something that is already easy to spot. There's an essay on the distinction over at WP:DUCK BugGhost🦗👻 14:45, 23 November 2024 (UTC)
- @2601AC47: No, and that is because the user in question's MO is to abuse VPNs. Checkuser is worthless in this case because of that (but the IPs can and should be blocked for 1yr as VPNs). —Jéské Couriano v^_^v threads critiques 19:35, 26 November 2024 (UTC)
- LTA MAB is using a peer-to-peer VPN service which is similar to TOR. Blocking peer-to-peer VPN service endpoint IP addresses carries a higher risk of collateral damage because those aren't assigned to the VPN provider but rather a third party ISP who is likely to dynamically reassign the blocked address to a completely innocent party. 216.126.35.235 (talk) 00:22, 27 November 2024 (UTC)
- I slightly oppose this idea. This is not Reddit where socks are immediately banned or shadowbanned outright. Reddit doesn't have WP:DUCK as any wiki does. Ahri Boy (talk) 00:14, 25 November 2024 (UTC)
- How do you know this is how Reddit deals with ban and suspension evasion? They use advanced techniques such as device and IP fingerprinting to ban and suspend users in under an hour. 2600:1700:69F1:1410:5D40:53D:B27E:D147 (talk) 23:47, 28 November 2024 (UTC)
- I can see where this is coming from, but we must realise that checkuser is not magic pixie dust nor is it meant for fishing. - Ratnahastin (talk) 04:49, 27 November 2024 (UTC)
- The question I ask myself is why must we realize that it is not meant for fishing? To catch fish, you need to fish. The no-fishing rule is not fit for purpose, nor is it a rule that other organizations that actively search for ban evasion use. Machines can do the fishing. They only need to show us the fish they caught. Sean.hoyland (talk) 05:24, 27 November 2024 (UTC)
- I think for the same reason we don't want governments to be reading our mail and emails. If we checkuser everybody, then nobody has any privacy. Donald Albury 20:20, 27 November 2024 (UTC)
I sympathize with Mztourist. The current system is less effective than it needs to be. Ban evading actors make a lot of edits, they are dedicated hard-working folk in contentious topic areas. They can make up nearly 10% of new extendedconfirmed actors some years and the quicker an actor becomes EC the more likely they are to be blocked later for ban evasion. Their presence splits the community into two classes, the sanctionable and the unsanctionable with completely different payoff matrices. This has many consequences in contentious topic areas and significantly impacts the dynamics. The current rules are probably not good rules. Other systems have things like a 'commitment to authenticity' and actively search for ban evasion. It's tempting to burn it all down and start again, but with what? Having said that, the SPI folks do a great job. The average time from being granted extendedconfirmed to being blocked for ban evasion seems to be going down. Sean.hoyland (talk) 18:28, 22 November 2024 (UTC)
- I confess that I am doubtful about that 10% claim. WhatamIdoing (talk) 23:43, 22 November 2024 (UTC)
- WhatamIdoing, me too. I'm doubtful about everything I say because I've noticed that the chance it is slightly to hugely wrong is quite high. The EC numbers are work in progress, but I got distracted. The description "nearly 10% of new extendedconfirmed actors" is a bit misleading, because 'new' doesn't really mean new actors. It means actors that acquired EC for a given year, so newly acquired privileges. They might have registered in previous years. Also, I don't have 100% confidence in the way I count EC grants because there are some edge cases, and I'm ignoring sysops. But anyway, the statement was based on this data of questionable precision. And the statement about a potential relationship between speed of EC acquisition and probability of being blocked is based on this data of questionable precision. And of course, currently undetected socks are not included, and there will be many. Sean.hoyland (talk) 03:39, 23 November 2024 (UTC)
- I'm not interested in clicking through to a Google file. Here's my back-of-the-envelope calculation: We have something like 120K accounts that would qualify for EXTCONF. Most of these are no longer active, and many stopped editing so long ago that they don't actually have the user right.
- Wikipedia is almost 24 years old. That makes convenient math: On average, since inception, 5K editors have achieved EXTCONF levels each year.
- If the 10% estimate is true, then 500 accounts per year – about 10 per week – are being created by banned editors and going undetected long enough for the accounts to make 500 edits and to work in CTOP areas. Do we even have enough WP:BANNED editors to make it plausible to expect banned editors to bring 500 accounts a year up to EXTCONF levels (plus however many accounts get started but are detected before then)? WhatamIdoing (talk) 03:53, 23 November 2024 (UTC)
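For readers who want to check the back-of-the-envelope figures in the comment above, here is a minimal sketch. The inputs (120K eligible accounts, 24 years, a 10% share) are the rough estimates quoted in the thread, not measured data, and the function name is purely illustrative.

```python
def extconf_estimate(eligible_accounts, years, sock_share):
    """Return (accounts reaching EXTCONF per year,
    of those, accounts run by banned editors per year,
    and the same figure per week)."""
    per_year = eligible_accounts / years
    socks_per_year = per_year * sock_share
    socks_per_week = socks_per_year / 52
    return per_year, socks_per_year, socks_per_week


# Rough figures quoted in the discussion (assumptions, not measurements).
per_year, socks_per_year, socks_per_week = extconf_estimate(
    eligible_accounts=120_000,  # accounts that would qualify for EXTCONF
    years=24,                   # approximate age of Wikipedia
    sock_share=0.10,            # the disputed "nearly 10%" figure
)

print(f"EXTCONF accounts per year: {per_year:.0f}")
print(f"Ban-evading EXTCONF accounts per year: {socks_per_year:.0f}")
print(f"Ban-evading EXTCONF accounts per week: {socks_per_week:.1f}")
```

Changing `sock_share` or `eligible_accounts` shows how sensitive the "about 10 per week" figure is to the starting estimates.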
- Suit yourself. I'm not interested in what interests other people or back of the envelope calculations. I'm interested in understanding the state of a system over time using evidence-based approaches by extracting data from the system itself. Let the data speak for itself. It has a lot to tell us. Then it is possible to test hypotheses and make evidence-based decisions. Sean.hoyland (talk) 04:13, 23 November 2024 (UTC)
- @WhatamIdoing, there's a sockmaster in the IPA CTOP who has made more than 100 socks. 500 new XC socks every year doesn't seem that much of a stretch in comparison. -- asilvering (talk) 19:12, 23 November 2024 (UTC)
- More than 100 XC socks? Or more than 100 detected socks, including socks with zero edits?
- Making a lot of accounts isn't super unusual, but it's a lot of work to get 100 accounts up to 500+ edits. Making 50,000 edits is a lot, even if it's your full-time job. WhatamIdoing (talk) 01:59, 24 November 2024 (UTC)
- Lots of users get it done in a couple of days, often through vandal fighting tools. It really is not that many when the edits are mostly mindless. nableezy - 00:18, 26 November 2024 (UTC)
- But that's kind of my point: "A couple of days", times 100 accounts, means 200–300 days per year. If you work five days per week and 52 weeks per year, that's 260 work days. This might be possible, but it's a full-time job.
- Since the 30-day limit is something that can't be achieved through effort, I wonder if a sudden change to, say, 6 months would produce a five-month reprieve. WhatamIdoing (talk) 02:23, 26 November 2024 (UTC)
- Who says it’s only one at a time? Icewhiz for example has had 4 plus accounts active at a time. nableezy - 02:25, 26 November 2024 (UTC)
- There is some data about ban evasion timelines for some sockmasters in PIA that show how accounts are operated in parallel. Operating multiple accounts concurrently seems to be the norm. Sean.hoyland (talk) 04:31, 26 November 2024 (UTC)
- Imagine that it takes an average of one minute to make a (convincing) edit. That means that 500 edits = 8.33 hours, i.e., more than one full work day.
- Imagine, too, that having reached this point, you actually need to spend some time using your newly EXTCONF account. This, too, takes time.
- If you operate several accounts at once, that means:
- You spend an hour editing from Account1. You spend the next hour editing from Account2. You spend another hour editing from Account3. You spend your fourth hour editing from Account4. Then you take a break for lunch, and come back to edit from Accounts 5 through 8.
- At the end of the day, you have brought 8 accounts up to 60 edits (12% of the minimum goal). And maybe one of them got blocked, too, which is lost effort. At this rate, it would take you an entire year of full-time work to get 100 EXTCONF accounts, even though you are operating multiple accounts concurrently. Doing 50 edits per day in 10 accounts is not faster than doing 500 edits in 1 account. It's the same amount of work. WhatamIdoing (talk) 05:13, 29 November 2024 (UTC)
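The effort estimate above can be sketched as follows, taking the comment's one-minute-per-edit figure and an eight-hour working day as assumptions. The function names are illustrative only. The sketch counts raw editing time and nothing else, so it leaves out the time spent actually using the accounts and any effort lost to blocks, both of which the comment also mentions.

```python
EXTCONF_EDITS = 500    # edit threshold for extended confirmed
MINUTES_PER_EDIT = 1   # assumption taken from the comment above


def hours_to_extconf(edits=EXTCONF_EDITS, minutes_per_edit=MINUTES_PER_EDIT):
    """Hours of editing needed to bring one account to the threshold."""
    return edits * minutes_per_edit / 60


def edits_per_account_per_day(accounts, hours_per_day,
                              minutes_per_edit=MINUTES_PER_EDIT):
    """Edits each account receives if one day's editing is split evenly."""
    total_edits = hours_per_day * 60 / minutes_per_edit
    return total_edits / accounts


def workdays_for(accounts, hours_per_day=8):
    """Working days of raw editing needed for `accounts` EXTCONF accounts.

    The result does not depend on how many accounts are run in parallel:
    the total number of edits is the same either way.
    """
    return accounts * hours_to_extconf() / hours_per_day


per_account = edits_per_account_per_day(accounts=8, hours_per_day=8)
print(f"Hours for one account: {hours_to_extconf():.2f}")
print(f"Edits per account after one day with 8 accounts: {per_account:.0f}")
print(f"Share of the 500-edit goal: {per_account / EXTCONF_EDITS:.0%}")
print(f"Working days of raw editing for 100 accounts: {workdays_for(100):.0f}")
```

Lowering `MINUTES_PER_EDIT`, as the replies suggest is realistic with semi-automated tools, shrinks every figure proportionally.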
- Sure it’s an effort, though it doesn’t take a minute an edit. But I’m not sure why I need to imagine something that has happened multiple times already. Icewhiz most recently had like 4-5 EC accounts active, and there are probably several more. Yes, there is an effort there. But also yes, it keeps happening. nableezy - 15:00, 29 November 2024 (UTC)
- My point is that "4-5 EC accounts" is not "100". WhatamIdoing (talk) 19:31, 30 November 2024 (UTC)
- It’s 4-5 at a time for a single sock master. Check the Icewhiz SPI for how many that adds up to over time. nableezy - 20:16, 30 November 2024 (UTC)
- Many of our frequent fliers are already adept at warehousing accounts for months or even years, so a bump in the time period probably won't make much of a difference. Additionally, and without going into detail publicly, there are several methods whereby semi- or even fully-automated editing can be used to get to 500 edits with a minimum of effort, or at least well within script-kid territory. Because so many of those are obvious on inspection some will assume that all of them are, but there are a number of rather subtle cases that have come up over the years and it would be foolish to assume that it isn't ongoing. 184.152.68.190 (talk) 17:31, 28 November 2024 (UTC)
Also, if we divide the space into contentious vs not-contentious, maybe a one size fits all CU policy doesn't make sense. Sean.hoyland (talk) 18:55, 22 November 2024 (UTC)
Terrible idea. Let's AGF that most new users are here to improve Wikipedia instead of damage it. Some1 (talk) 18:33, 22 November 2024 (UTC)
- Ban evading actors who employ deception via sockpuppetry in the WP:PIA topic area are here to improve Wikipedia, from their perspective, rather than damage it. There is no need to use faith. There are statistics. There is a probability that a 'new user' is employing ban evasion. Sean.hoyland (talk) 18:46, 22 November 2024 (UTC)
- My initial comment wasn't a direct response to yours, but new users and IPs won't be able to edit in the WP:PIA topic area anyway since they need to be extended confirmed. Some1 (talk) 20:08, 22 November 2024 (UTC)
- Let's not hold up the way PIA handles new users and IPs, in which they are allowed to post to talk pages but then have their talk page post removed if it doesn't fall within very specific parameters, as some sort of model. CMD (talk) 02:51, 23 November 2024 (UTC)
Strongly support automatically checkusering all active users (new and existing) at regular intervals. If it were automated -- e.g., a script runs that compares IPs, user agent, other typical subscriber info -- there would be no privacy violation, because that information doesn't have to be disclosed to any human beings. Only the "hits" can be forwarded to the CU team for follow-up. I'd run that script daily. If the policy forbids it, we should change the policy to allow it. It's mind-boggling that Wikipedia doesn't do this already. It's a basic security precaution. (Also, email-required registration and get rid of IP editing.) Levivich (talk) 02:39, 23 November 2024 (UTC)
- I don't think you've been reading the comments from people who know what they are talking about. There would be hundreds, at least, of hits per day that would require human checking. The policy that prohibits this sort of massive breach of privacy is the Foundation's and so not one that en.wp could change even if it were a good idea (which it isn't). Thryduulf (talk) 03:10, 23 November 2024 (UTC)
- A computer can be programmed to check for similarities or patterns in subscriber info (IP, etc), and in editing activity (time cards, etc), and content of edits and talk page posts (like the existing language similarity tool), with various degrees of certainty in the same way the Cluebot does with ORES when it's reverting vandalism. And the threshold can be set so it only forwards matches of a certain certainty to human CUs for review, so as not to overwhelm the humans. The WMF can make this happen with just $1 million of its $180 million per year (and it wouldn't be violating its own policies if it did so). Enwiki could ask for it, other projects might join too. Levivich (talk) 05:24, 23 November 2024 (UTC)
- "Oh now I see what you mean, Levivich, good point, I guess you know what you're talking about, after all."
- "Thanks, Thryduulf!" Levivich (talk) 17:42, 23 November 2024 (UTC)
- I seem to have missed this comment, sorry. However I am very sceptical that sockpuppet detection is meaningfully automatable. From what CUs say it is as much art as science (which is why SPI cases can result in determinations like "possilikely"). This is the sort of thing that is difficult (at best) to automate. Additionally the only way to reliably develop such automation would be for humans to analyse and process a massive amount of data from accounts that both are and are not sockpuppets and classify results as one or the other, and that analysis would be a massive privacy violation on its own. Assuming you have developed this magic computer that can assign a likelihood of any editor being a sock of someone who has edited in the last three months (data older than that is deleted) on a percentage scale, you then have to decide what level is appropriate to send to humans to check. Say for the sake of argument it is 75%, that means roughly one in four people being accused are innocent and are having their privacy impinged unnecessarily - and how many CUs are needed to deal with this caseload? Do we have enough? SPI isn't exactly backlog free and there aren't hordes of people volunteering for the role (although unbreaking RFA might help with this in the medium to long term). The more you reduce the number sent to CUs to investigate, the less benefit there is over the status quo.
- In addition to all the above, how similar is "similar" in terms of articles edited, writing style, timecard, etc? How are you avoiding legitimate sockpuppets? Thryduulf (talk) 18:44, 23 November 2024 (UTC)
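The trade-off described above, between the review threshold, the CU caseload, and the number of innocent editors checked, can be illustrated with a small sketch. It assumes a hypothetical detector whose scores are well-calibrated probabilities, so that an account scored 0.75 really is a sock three times out of four. The scores below are made up for illustration and do not come from any real tool.

```python
def review_load(scores, threshold):
    """Return (number of accounts forwarded for human review,
    expected number of those that are innocent).

    Assumes each score is a calibrated probability that the account
    is a sockpuppet, which is a strong assumption for any real detector.
    """
    flagged = [s for s in scores if s >= threshold]
    expected_innocent = sum(1 - s for s in flagged)
    return len(flagged), expected_innocent


# Hypothetical scores for eight accounts (illustrative only).
scores = [0.99, 0.97, 0.92, 0.80, 0.76, 0.75, 0.60, 0.40]

for threshold in (0.75, 0.90, 0.99):
    flagged, innocent = review_load(scores, threshold)
    print(f"threshold {threshold:.2f}: {flagged} forwarded, "
          f"about {innocent:.2f} expected to be innocent")
```

Raising the threshold lowers both the caseload and the expected number of innocent editors reviewed, but it also forwards fewer accounts overall, which is the tension both sides of the thread are pointing at.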
- You know this already but for anyone reading this who doesn't: when a CU "checks" somebody, it's not like they send a signal out to that person's computer to go sniffing around. In fact, all the subscriber info (IP address, etc.) is already logged on the WMF's server logs (as with any website). A CU "check" just means a volunteer CU gets to look at a portion of those logs (to look up a particular account's subscriber info). That's the privacy concern: we have rules, rightfully so, about when volunteer CUs (not WMF staff) can read the server logs (or portions of them). Those rules do not apply to WMF staff, like devs and maintenance personnel, nor do they apply to the WMF's own software reading its own logs. Privacy is only an issue when those logs are revealed to volunteer CUs.
- So... feeding the logs into software in order to train the software doesn't violate anyone's policy. It's just letting a computer read its own files. Human verification of the training outcomes also doesn't have to violate anyone's privacy -- just don't use volunteer CUs to do it, use WMF staff. Or, anonymize the training data (changing usernames to "Example1", "Example2", etc.). Or use historical data -- which would certainly be part of the training, since the most effective way would be to put known socks into the training data to see if the computer catches them.
- Anyway, training the system won't violate anyone's privacy.
- As for the hit rate -- 75% would be way, way too low. We'd be looking for definitely over 90% or 95%, and probably more like 99.something percent. Cluebot doesn't get vandalism wrong 1 out of 4 times, neither should CluebotCU. Heck, if CluebotCU can't do better than 75%, it's not worth doing. A more interesting question is whether the 99.something% hit rate would be helpful to CUs, or whether that would only catch the socks that are so obvious you don't even need CU to recognize them. Only testing in the field would tell.
- But overall, AI looking for patterns, and checking subscriber info, edit patterns, and the content of edits, would be very helpful in tamping down on socking, because the computer can make far more checks than a human (a computer can look at 1,000 accounts and 100,000 edits no problem, which no human can do), it'll be less biased than humans, and it can do it all without violating anyone's privacy -- in fact, lowering the privacy violations by lowering the false positives, sending only high-probability (90%+, not 75%+) to humans for review. And it can all be done with existing technology, and the WMF has the money to do it. Levivich (talk) 19:38, 23 November 2024 (UTC)
- The more you write the clearer you make it that you don't understand checkuser or the WMF's policies regarding privacy. It's also clear that I'm not going to convince you that this is unworkable so I'll stop trying. Thryduulf (talk) 20:42, 23 November 2024 (UTC)
- Yeah it's weird how repeatedly insulting me hasn't convinced me yet. Levivich (talk) 20:57, 23 November 2024 (UTC)
- If you are unable to distinguish between reasoned disagreement and insults, then it's not at all weird that reasoned disagreement fails to convince you. Thryduulf (talk) 22:44, 23 November 2024 (UTC)
- @Levivich: Whatever existing data set we have has too many biases to be useful for this, and this is going to be prone to false positives. AI needs lots of data to be meaningfully trained. Also, AI here would be learning a function; when the output is not in fact a function of the input, there's nothing for an AI model to target, and this is very much the case here. On Wikidata, where I am a CheckUser, almost all edit summaries are automated even for human edits (just like clicking the rollback button is, or undoing an edit is by default), and it is very hard to meaningfully tell whether someone is a sock or not without highly case-specific analysis. No AI model is better than the data it's trained on.
- Also, about the privacy policy: you are completely incorrect when you say "Those rules do not apply to WMF staff, like devs and maintenance personnel, nor do they apply to the WMF's own software reading its own logs". Staff can only access that information on a need to know basis, just like CheckUsers, and data privacy laws like the EU's and California's mean you cannot just do whatever random thing you want with the information you collect from users about them.--Jasper Deng (talk) 21:56, 23 November 2024 (UTC)
- So which part of the wmf:Privacy Policy would prohibit the WMF from developing an AI that looks at server logs to find socks? Do you want me to quote to you the portions that explicitly disclose that the WMF uses personal information to develop tools and improve security? Levivich (talk) 22:02, 23 November 2024 (UTC)
- I mean yeah that would probably be more productive than snarky bickering BugGhost🦗👻 22:05, 23 November 2024 (UTC)
- @Levivich: Did you read the part where I mentioned privacy laws? Also, in this industry no one is allowed unfettered usage of private data even internally; there are internal policies that govern this that are broadly similar to the privacy policy. It's one thing to test a proposed tool on an IP address like Special:Contribs/2001:db8::/32, but it's another to train an AI model on it. Arguably an equally big privacy concern is the usage of new data from new users after the model is trained and brought online. The foundation is already hiding IP addresses by default even for anonymous users soon, and they will not undermine that mission through a tool like this. Ultimately, the Board of Trustees has to assume legal responsibility and liability for such a thing; put yourself in their position and think of whether they'd like the liability of something like this.--Jasper Deng (talk) 22:13, 23 November 2024 (UTC)
- So can you quote a part of the privacy policy, or a part of privacy laws, or anything, that would prohibit feeding server logs into a "Cluebot-CU" to find socking?
- Because I can quote the part of the wmf:Privacy Policy that allows it, and it's a lot:
We may use your public contributions, either aggregated with the public contributions of others or individually, to create new features or data-related products for you or to learn more about how the Wikimedia Sites are used ...
Because of how browsers work, we receive some information automatically when you visit the Wikimedia Sites ... This information includes the type of device you are using (possibly including unique device identification numbers, for some beta versions of our mobile applications), the type and version of your browser, your browser's language preference, the type and version of your device's operating system, in some cases the name of your internet service provider or mobile carrier, the website that referred you to the Wikimedia Sites, which pages you request and visit, and the date and time of each request you make to the Wikimedia Sites.
Put simply, we use this information to enhance your experience with Wikimedia Sites. For example, we use this information to administer the sites, provide greater security, and fight vandalism; optimize mobile applications, customize content and set language preferences, test features to see what works, and improve performance; understand how users interact with the Wikimedia Sites, track and study use of various features, gain understanding about the demographics of the different Wikimedia Sites, and analyze trends. ...
We actively collect some types of information with a variety of commonly-used technologies. These generally include tracking pixels, JavaScript, and a variety of "locally stored data" technologies, such as cookies and local storage. ... Depending on which technology we use, locally stored data may include text, Personal Information (like your IP address), and information about your use of the Wikimedia Sites (like your username or the time of your visit). ... We use this information to make your experience with the Wikimedia Sites safer and better, to gain a greater understanding of user preferences and their interaction with the Wikimedia Sites, and to generally improve our services. ...
We and our service providers use your information ... to create new features or data-related products for you or to learn more about how the Wikimedia Sites are used ... To fight spam, identity theft, malware and other kinds of abuse. ... To test features to see what works, understand how users interact with the Wikimedia Sites, track and study use of various features, gain understanding about the demographics of the different Wikimedia Sites and analyze trends. ...
When you visit any Wikimedia Site, we automatically receive the IP address of the device (or your proxy server) you are using to access the Internet, which could be used to infer your geographical location. ... We use this location information to make your experience with the Wikimedia Sites safer and better, to gain a greater understanding of user preferences and their interaction with the Wikimedia Sites, and to generally improve our services. For example, we use this information to provide greater security, optimize mobile applications, and learn how to expand and better support Wikimedia communities. ...
We, or particular users with certain administrative rights as described below, need to use and share your Personal Information if it is reasonably believed to be necessary to enforce or investigate potential violations of our Terms of Use, this Privacy Policy, or any Wikimedia Foundation or user community-based policies. ... We may also disclose your Personal Information if we reasonably believe it necessary to detect, prevent, or otherwise assess and address potential spam, malware, fraud, abuse, unlawful activity, and security or technical concerns. ... To facilitate their work, we give some developers limited access to systems that contain your Personal Information, but only as reasonably necessary for them to develop and contribute to the Wikimedia Sites. ...
- Yeah that's a lot. Then there's this whole FAQ that says:
It is important for us to be able to make sure everyone plays by the same rules, and sometimes that means we need to investigate and share specific users' information to ensure that they are.
For example, user information may be shared when a CheckUser is investigating abuse on a Project, such as suspected use of malicious "sockpuppets" (duplicate accounts), vandalism, harassment of other users, or disruptive behavior. If a user is found to be violating our Terms of Use or other relevant policy, the user's Personal Information may be released to a service provider, carrier, or other third-party entity, for example, to assist in the targeting of IP blocks or to launch a complaint to the relevant Internet Service Provider.
- So using IP addresses, etc., to develop new tools, to test features, to fight violations of the Terms of Use, and disclosing that info to Checkusers... all explicitly permitted by the Privacy Policy. Levivich (talk) 22:22, 23 November 2024 (UTC)
- @Levivich: "We, or particular users with certain administrative rights as described below, need to use and share your Personal Information if it is reasonably believed to be necessary to enforce or investigate potential violations of our Terms of Use" – "reasonably believed to be necessary" is not going to hold up in court when it's sweepingly applied to everyone. This doesn't even take into consideration the laws I mentioned, like GDPR. I'm not a lawyer, and I'm guessing neither are you. If you want to be the one assuming the legal liability for this, contact the board today and sign the contract. Even then they would probably not agree to such an arrangement. So you're preaching to the choir: only the foundation could even consider assuming this risk. Also, it's clear that you do not have a single idea of how developing something like this works if you think it can be done for $1 million. Something this complex has to be done right and tech salaries and computing resources are expensive.--Jasper Deng (talk) 22:28, 23 November 2024 (UTC)
- What I am suggesting does not involve sharing everyone's data with Checkusers. It's pretty obvious that looking at their own server logs is "necessary to enforce or investigate potential violations of our Terms of Use". Five people is how big the WMF's wmf:Machine Learning team is, @ $200k each, $1m/year covers it. Five people is enough for that team to improve ORES, so another five-person team dedicated to "ORES-CU" seems a reasonable place to start. They could double that, and still have like $180M left over. Levivich (talk) 22:40, 23 November 2024 (UTC)
- @Levivich: Yeah no, lol. $200k each is not a very competitive total compensation, considering that that needs to include benefits, health insurance, etc. This doesn't include their manager or the hefty hardware required to run ML workflows. It doesn't include the legal support required given the data privacy law compliance needed. Capriciously looking at the logs does not count; accessing data of users who cannot reasonably be said to be likely to cause abuse is not permissible. This all aside from the bias and other data quality issues at hand here. You can delude yourself all you want, but nature cannot be fooled. I'm finished arguing with you anyways, because this proposal is either way dead on arrival.--Jasper Deng (talk) 23:45, 23 November 2024 (UTC)
- @Jasper Deng, haggling over the math here isn't really important. You could quintuple the figures @Levivich gave and the Foundation would still have millions upon millions of dollars left over. -- asilvering (talk) 23:48, 23 November 2024 (UTC)
- @Asilvering: The point I'm making is Levivich does not understand the complexity behind this kind of thing and thus his arguments are not to be given weight by the closer. Jasper Deng (talk) 23:56, 23 November 2024 (UTC)
- As a statistician/data scientist, @Levivich is correct about the technical side of this: building an ML algorithm to detect sockpuppets would be pretty easy. Duplicate user algorithms like these are common across many websites. For a basic classification task like this (basically an ML 101 homework problem), I think $1 million is about right. As a bonus, the same tools could be used to identify and correct for possible canvassing or brigading, which behaves a lot like sockpuppetry from a statistical perspective. A similar algorithm is already used by Twitter's community notes feature.
- IANAL, so I can't comment on the legal side of this, and I can't comment on whether that money would be better-spent elsewhere since I don't know what the WMF budget looks like. Overall though, the technical implementation wouldn't be a major hurdle. – Closed Limelike Curves (talk) 20:44, 24 November 2024 (UTC)
- Third-party services like Sift.com provide this kind of algorithm-based account fraud protection as an alternative to building and maintaining internally. czar 23:41, 24 November 2024 (UTC)
- Building such a model is only a small part of a real production system. If this system is to operate on all account creations, it needs to be at least as reliable as the existing systems that handle account creations. As you probably know, data scientists developing such a model need to be supported by software engineers and site reliability engineers supporting the actual system. Then you have the problem of new sockers who are not on the list of sockmasters to check against. Non-English-language speakers often would be put at a disadvantage too. It's not as trivial as you make it out to be, thus I stand by my estimate.--Jasper Deng (talk) 06:59, 25 November 2024 (UTC)
- None of you have accounted for Hofstadter's law.
- I don't think we need to spend more time speculating about a system that WMF Legal is extremely unlikely to accept. Even if they did, it wouldn't exist until several years from now. Instead, let's try to think of things that we can do ourselves, or with only a very little assistance. Small, lightweight projects with full community control can help us now, and if we prove that ____ works, the WMF might be willing to adopt and expand it later. WhatamIdoing (talk) 23:39, 25 November 2024 (UTC)
- That's a mistake -- doing the same thing Wikipedia has been doing for 20+ years. The mistake is in leaving it to volunteers to catch sockpuppetry, rather than insisting that the WMF devote significant resources to it. And it's a mistake because the one thing we volunteers can't do, that the WMF can do, is comb through the server logs looking for patterns. Levivich (talk) 23:44, 25 November 2024 (UTC)
- Not sure about the "building an ML algorithm to detect sockpuppets would be pretty easy" part, but I admire the optimism. It is certainly the case that it is possible, and people have done it with a surprising level of success a very long time ago in ML terms e.g. https://doi.org/10.1016/j.knosys.2018.03.002. These projects tend to rely on the category graph to distinguish sock and non-sock sets for training, the categorization of accounts as confirmed or suspected socks. However, the category graph is woefully incomplete i.e. there is information in the logs that is not reflected in the graph, so ensuring that all ban evasion accounts are properly categorized as such might help a bit. Sean.hoyland (talk) 03:58, 26 November 2024 (UTC)
- Thankfully, we wouldn't have to build an ML algorithm, we can just use one of the existing ones. Some are even open source. Or WMF could use a third party service like the aforementioned sift.com. Levivich (talk) 16:17, 26 November 2024 (UTC)
- Let me guess: Essentially, you would like their machine-learning team to use Sift's "AI-Powered Fraud Protection", which from what I can glance, handles "safeguarding subscriptions to defending digital content and in-app purchases" and "helps businesses reduce friction and stop sophisticated fraud attacks that gut growth", to provide the ability for us to "automatically checkuser all active users"? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 16:25, 26 November 2024 (UTC)
- The WMF already has the ability to "automatically checkuser all users" (the verb "checkuser" just means "look at the server logs"), I'm suggesting they use it. And that they use it in a sophisticated way, employing (existing, open source or commercially available) AI/ML technologies, like the same kind we already use to automatically revert vandalism. Contrary to claims here, doing so would not be illegal or even expensive (comparatively, for the WMF). Levivich (talk) 16:40, 26 November 2024 (UTC)
- So, in my attempt to get things set right and steer towards a consensus that is satisfactory, I sincerely follow-up: What lies beyond that in this vast, uncharted sea? And could this mean any more in the next 5 years? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 16:49, 26 November 2024 (UTC)
- What lies beyond is mw:Extension:SimilarEditors. Levivich (talk) 17:26, 26 November 2024 (UTC)
- So, @2601AC47, I think the answer to your question is "tell the WMF we really, really, really would like more attention to sockpuppetry and IP abuse from the ML team". -- asilvering (talk) 17:31, 26 November 2024 (UTC)
- Which I don't suppose someone can at the next board meeting on December 11? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 18:00, 26 November 2024 (UTC)
- I may also point to this, where they mention "development in other areas, such as social media features and machine learning expertise". 2601AC47 (talk·contribs·my rights) Isn't a IP anon 16:36, 26 November 2024 (UTC)
- e.g. m:Research:Sockpuppet_detection_in_Wikimedia_projects Sean.hoyland (talk) 17:02, 26 November 2024 (UTC)
- And that mentions Socksfinder, still in beta it seems. 2601AC47 (talk·contribs·my rights) Isn't a IP anon 17:10, 26 November 2024 (UTC)
- 3 days! When I first posted my comment and some editors responded that I didn't know what I was talking about, it can't be done, it'd violate the privacy policy and privacy laws, WMF Legal would never allow it... I was wondering how long it would take before somebody pointed out that this thing that can't be done has already been done and has been under development for at least 7 years now.
- Of course it's already under development, it's pretty obvious that the same Wikipedia that developed ClueBot, one of the world's earlier and more successful examples of ML applications, would try to employ ML to fight multiple-account abuse. I mean, I'm obviously not gonna be the first person to think of this "innovation"!
- Anyway, it took 3 days. Thanks, Sean! Levivich (talk) 17:31, 26 November 2024 (UTC)
- Unlike what is being proposed, SimilarEditors only works based on publicly available data (e.g. similarities in editing patterns), and not IP data. To quote the page Sean linked, "in the model's current form, we are only considering public data, but most saliently private data such as IP addresses or user-agent information are features currently used by checkusers that could be later (carefully) incorporated into the models". So, not only does the current model not look at IP data, the research project also acknowledges that actually using such data should only be done in a "careful" way, because of those very same privacy policy issues quoted above. On the ML side, however, this does prove that it's being worked on, and I'm honestly not surprised at all that the WMF is working on machine learning-based tools to detect sockpuppets. Chaotic Enby (talk · contribs) 17:50, 26 November 2024 (UTC)
- Right. We should ask WMF to do the "later (carefully) incorporated into the models" part (especially since it's now later). BTW, the SimilarUsers API already pulls IP and other metadata. SimilarExtensions (a tool that uses the API) doesn't release that information to CheckUsers, by design. And that's a good thing, we can't just release all IPs to CheckUsers, it does indeed have to be done carefully. But user metadata can be used. What I'm suggesting is that the WMF should proceed to develop these types of tools (including the careful use of user metadata). Levivich (talk) 17:57, 26 November 2024 (UTC)
- Not really clear that they're pulling IP data from logged-in users. The relevant section reads: "USER_METADATA (203MB): for every user in COEDIT_DATA, this contains basic metadata about them (total number of edits in data, total number of pages edited, user or IP, timestamp range of edits)." This reads like they're collecting the username or IP depending on whether they're a logged-in user or an IP user. Chaotic Enby (talk · contribs) 18:14, 26 November 2024 (UTC)
- In a few years people might look back on these days when we only had to deal with simple devious primates employing deception as the halcyon days. Sean.hoyland (talk) 18:33, 26 November 2024 (UTC)
- I assumed 1 million USD/year was accounting for Hofstadter's law several times over. Otherwise it feels wildly pessimistic. – Closed Limelike Curves (talk) 15:57, 26 November 2024 (UTC)
- @Jasper Deng, haggling over the math here isn't really important. You could quintuple the figures @Levivich gave and the Foundation would still have millions upon millions of dollars left over. -- asilvering (talk) 23:48, 23 November 2024 (UTC)
- @Levivich: Yeah no, lol. $200k each is not a very competitive total compensation, considering that that needs to include benefits, health insurance, etc. This doesn't include their manager or the hefty hardware required to run ML workflows. It doesn't include the legal support required given the data privacy law compliance needed. Capriciously looking at the logs does not count; accessing data of users the foundation cannot reasonably have said to be likely to cause abuse is not permissible. This all aside from the bias and other data quality issues at hand here. You can delude yourself all you want, but nature cannot be fooled. I'm finished arguing with you anyways, because this proposal is either way dead on arrival.--Jasper Deng (talk) 23:45, 23 November 2024 (UTC)
- What I am suggesting does not involve sharing everyone's data with Checkusers. It's pretty obvious that looking at their own server logs is "necessary to enforce or investigate potential violations of our Terms of Use". Five people is how big the WMF's wmf:Machine Learning team is, @ $200k each, $1m/year covers it. Five people is enough for that team to improve ORES, so another five-person team dedicated to "ORES-CU" seems a reasonable place to start. They could double that, and still have like $180M left over. Levivich (talk) 22:40, 23 November 2024 (UTC)
- So which part of the wmf:Privacy Policy would prohibit the WMF from developing an AI that looks at server logs to find socks? Do you want me to quote to you the portions that explicitly disclose that the WMF uses personal information to develop tools and improve security? Levivich (talk) 22:02, 23 November 2024 (UTC)
- The more you write the clearer you make it that you don't understand checkuser or the WMF's policies regarding privacy. It's also clear that I'm not going to convince you that this is unworkable so I'll stop trying. Thryduulf (talk) 20:42, 23 November 2024 (UTC)
- A computer can be programmed to check for similarities or patterns in subscriber info (IP, etc), and in editing activity (time cards, etc), and content of edits and talk page posts (like the existing language similarity tool), with various degrees of certainty in the same way the Cluebot does with ORES when it's reverting vandalism. And the threshold can be set so it only forwards matches of a certain certainty to human CUs for review, so as not to overwhelm the humans. The WMF can make this happen with just $1 million of its $180 million per year (and it wouldn't be violating its own policies if it did so). Enwiki could ask for it, other projects might join too. Levivich (talk) 05:24, 23 November 2024 (UTC)
IP range 2600:1700:69F1:1410:0:0:0:0/64 blocked by a CU. The following discussion has been closed. Please do not modify it.
- Any such system would be subject to numerous biases or be easily defeatable. Such an automated anti-abuse system would have to be exclusively a foundation initiative as only they have the resources for such a monumental undertaking. It would need its own team of developers.--Jasper Deng (talk) 18:57, 23 November 2024 (UTC)
Absolutely no chance that this would pass. WP:SNOW, even though there isn't a flood of opposes. There are two problems:
- The existing CheckUser team barely has the bandwidth for the existing SPI load. Doing this on every single new user would be impractical and would enable WP:LTAs by diverting valuable CheckUser bandwidth.
- Even if we had enough CheckUsers, this would be a severe privacy violation absolutely prohibited under the Foundation privacy policy.
The vast majority of vandals and other disruptive users don't need CU involvement to deal with. There's very little to be gained from this.--Jasper Deng (talk) 18:36, 23 November 2024 (UTC)
- It is perhaps an interesting conversation to have but I have to agree that it is unworkable, and directly contrary to foundation-level policy which we cannot make a local exemption to. En.wp, I believe, already has the largest CU team of any WMF project, but we would need hundreds more people on that team to handle something like this. In the last round of appointments, the committee approved exactly one checkuser, and that one was a returning former member of the team. And there is the very real risk that if we appointed a whole bunch of new CUs, some of them would abuse the tool. Just Step Sideways from this world ..... today 18:55, 23 November 2024 (UTC)
- And it's worth pointing out that the Committee approving too few volunteers for Checkuser (regardless of whether you think they are or aren't) is not a significant part of this issue. There simply are not tens of people who are putting themselves forward for consideration as CUs. Since 2016 54 applications (an average of per year) have been put forward for consideration by Functionaries (the highest was 9, the lowest was 2). Note this is total applications not applicants (more than one person has applied multiple times), and is not limited to candidates who had a realistic chance of being appointed. Thryduulf (talk) 20:40, 23 November 2024 (UTC)
- The dearth of candidates has for sure been an ongoing thing, it's worth reminding admins that they don't have to wait for the committee to call for candidates, you can put your name forward at any time by emailing the committee. Just Step Sideways from this world ..... today 23:48, 24 November 2024 (UTC)
- Generally, I tend to get the impression from those who have checkuser rights that CU should be done as a last resort, and other, less invasive methods are preferred, and it would seem that indiscriminate use of it would be a bad idea, so I would have some major misgivings about this proposal. And given the ANI case, the less user information that we retain, the better (which is also probably why temporary accounts are a necessary and prudent idea despite other potential drawbacks). Abzeronow (talk) 03:56, 23 November 2024 (UTC)
- Oppose. A lot has already been written on the unsustainable workload for the CU team this would create and the amount of collateral damage; I'll add in the fact that our most notorious sockmasters in areas like PIA already use highly sophisticated methods to evade CU detection, and based on what I've seen at the relevant SPIs most of the blocks in these cases are made with more weight given to the behaviour, and even then only after lengthy deliberations on the matter. These sort of sockmasters seem to have been in the OP's mind when the request was made, and I do not see automated CU being of any more use than current techniques against such dedicated sockmasters. And, as has been mentioned before, most cases of sockpuppetry (such as run-of-the-mill vandals and trolls using throwaway accounts for abuse) don't need CU anyways. JavaHurricane 08:17, 24 November 2024 (UTC)
- These are, unfortunately, fair points about the limits of CU and the many experienced and dedicated ban evading actors in PIA. CU information retention policy is also a complicating factor. Sean.hoyland (talk) 08:28, 24 November 2024 (UTC)
- As I said in my original post, recidivist socks often get better at covering their "tells" each time making behavioural detection increasingly difficult and meaning the entire burden falls on the honest user to convince an Admin to take an SPI case seriously with scarce evidence. After many years I'm tired of defending various pages from sock POV edits and if WMF won't make life easier then increasingly I just won't bother, I'm sure plenty of other users feel the same way. Mztourist (talk) 05:45, 26 November 2024 (UTC)
SimilarEditors
The development of mw:Extension:SimilarEditors -- the type of tool that could be used to do what Mztourist suggests -- has been "stalled" since 2023 and downgraded to low-priority in 2024, according to its documentation page and related phab tasks (see e.g. phab:T376548, phab:T304633, phab:T291509). Anybody know why? Levivich (talk) 17:43, 26 November 2024 (UTC)
- Honestly, the main function of that sort of thing seems to be compiling data that is already available on XTools and various editor interaction analyzers, and then presenting it nicely and neatly. I think that such a page could be useful as a sanity check, and it might even be worth having that sort of thing as a standalone toolforge app, but I don't really see why the WMF would make that particular extension a high priority. — Red-tailed hawk (nest) 17:58, 26 November 2024 (UTC)
- Well, it doesn't have to be that particular extension, but it seems to me that the entire "idea" has been stalled, unless they're working on another tool that I'm unaware of (very possible). (Or, it could be because of recent changes in domestic and int'l privacy laws that derailed their previous development advances, or it could be because of advancements in ML elsewhere making in-house development no longer practical.)
As to why the WMF would make this sort of problem a high priority, I'd say because the spread of misinformation on Wikipedia by sockpuppets is a big problem. Even without getting into the use of user metadata, just look at recent SPIs I filed, like Wikipedia:Sockpuppet investigations/Icewhiz/Archive#27 August 2024 and Wikipedia:Sockpuppet investigations/Icewhiz/Archive#09 October 2024. That involved no private data at all, but a computer could have done automatically, in seconds, what took me hours to do manually, and those socks could have been uncovered before they made thousands and thousands of edits spreading misinformation. If the computer looked at private data as well as public data, it would be even more effective (and would save CUs time as well). Seems to me to be a worthy expenditure of 0.5% or 1% of the WMF's annual budget. Levivich (talk) 18:09, 26 November 2024 (UTC)
- This looks really interesting. I don't really know how extensions are rolled out to individual wikis - can anyone with knowledge about that summarise if having this tool turned on (for check users/relevant admins) for en.wp is feasible? Do we need a RFC, or is this a "maybe wait several years for a phab ticket" situation? BugGhost🦗👻 18:09, 26 November 2024 (UTC)
- I find it amusing that ~4 separate users above are arguing that automatic identification of sockpuppets is impossible, impractical, and the WMF would never do it—and meanwhile, the WMF is already doing it. – Closed Limelike Curves (talk) 19:29, 27 November 2024 (UTC)
- So, discussion is over? 2601AC47 (talk·contribs·my rights) Isn't a IP anon 19:31, 27 November 2024 (UTC)
- I think what's happening is that people are having two simultaneous discussions – automatic identification of sockpuppets is already being done, but what people say "the WMF would never do" is using private data (e.g. IP addresses) to identify them. Which adds another level of (ethical, if not legal) complications compared to what SimilarEditors is doing (only processing data everyone can access, but in an automated way). Chaotic Enby (talk · contribs) 07:59, 28 November 2024 (UTC)
- "automatic identification of sockpuppets is already being done" is probably an overstatement, but I agree that there may be a potential legal and ethical minefield between the Similarusers service that uses public information available to anyone from the databases after redaction of private information (i.e. coarse-grained sampling of revision timestamps combined with an attempt to quantify page intersection data), and a service that has access to the private information associated with a registered account name. Sean.hoyland (talk) 11:15, 28 November 2024 (UTC)
- The WMF said they're planning on incorporating IP addresses and device info as well! – Closed Limelike Curves (talk) 21:21, 29 November 2024 (UTC)
- Yes, automatic identification of (these) sockpuppets is impossible. There are many reasons for this, but the simplest one is this: These types of tools require hundreds of edits – at minimum – to return any viable data, and the sort of sockmasters who get accounts up to that volume of edits know how to evade detection by tools that analyse public information. The markers would likely indicate people from similar countries – naturally, two Cypriots would be interested in Category:Cyprus and over time similar hour and day overlaps will emerge, but what's to let you know whether these are actual socks when they're evading technical analysis? You're back to square one. There are other tools such as mediawikiwiki:User:Ladsgroup/masz which I consider equally circumstantial; an analysis of myself returns a high likelihood of me being other administrators and arbitrators, while analysing an alleged sock currently at SPI returns the filer as the third most likely sockmaster. This is not commentary on the tools themselves, but rather simply the way things are. DatGuyTalkContribs 17:42, 28 November 2024 (UTC)
- Oh, fun! Too bad it's CU-restricted, I'm quite curious to know what user I'm most stylometrically similar to. -- asilvering (talk) 17:51, 28 November 2024 (UTC)
- That would be LittlePuppers and LEvalyn. DatGuyTalkContribs 03:02, 29 November 2024 (UTC)
- Fascinating! One I've worked with, one I haven't, both AfC reviewers. Not bad. -- asilvering (talk) 06:14, 29 November 2024 (UTC)
- Idk, the half dozen ARBPIA socks I recently reported at SPI were obvious af to me, as are several others I haven't reported yet. That may be because that particular sockfarm is easy to spot by its POV pushing and a few other habits; though I bet in other topic areas it's the same. WP:ARBECR helps because it forces the socks to make 500 edits minimum before they can start POV pushing, but still we have to let them edit for a while post-XC just to generate enough diffs to support an SPI filing. Software that combines tools like Masz and SimilarEditor, and does other kinds of similar analysis, could significantly reduce the amount of editor time required to identify and report them. Levivich (talk) 18:02, 28 November 2024 (UTC)
- I think it is possible, studies have demonstrated that it is possible, but it is true that having a sufficient number of samples is critical. Samples can be aggregated in some cases. There are several other important factors too. I have tried some techniques, and sometimes they work, or let's say they can sometimes produce results consistent with SPI results, better than random, but with plenty of false positives. It is also true that there are a number of detection countermeasures (that I won't describe) that are already employed by some bad actors that make detection harder. But I think the objective should be modest, to just move a bit in the right direction by detecting more ban evading accounts than are currently detected, or at least to find ways to reduce the size of the search space by providing ban evasion candidates. Taking the human out of the detection loop might take a while. Sean.hoyland (talk) 18:39, 28 November 2024 (UTC)
- If you mean it's never going to be possible to catch some sockpuppets—the best-hidden, cleverest, etc. ones—you're completely correct. But I'm guessing we could cut the amount of time SPI has to spend dramatically with just some basic checks. – Closed Limelike Curves (talk) 02:27, 29 November 2024 (UTC)
- I disagree. Empirically, the vast majority of time spent at SPI is not on finding possible socks, nor is it using the CheckUser tool on them, but rather it's the CU completed cases (of which there are currently 14 and I should probably stop slacking and get onto some) with non-definitive technical results waiting on an administrator to make the final determination on whether they're socks or not. Extension:SimilarUsers would concentrate various information that already exists (EIA, RoySmith's SPI tools) in one place, but I wouldn't say the accessibility of these tools is a cause of SPI backlog. An AI analysis tool to give an accurate magic number for likelihood? I'm anything but a Luddite, but still believe that's wishful thinking. DatGuyTalkContribs 03:02, 29 November 2024 (UTC)
- Something seems better than nothing in this context doesn't it? EIA and the Similarusers service don't provide an estimate of the significance of page intersections. An intersection on a page with few revisions or few unique actors or few pageviews etc. is very different from a page intersection on the Donald Trump page. That kind of information is probably something that could sometimes help, even just to evaluate the importance of intersection evidence presented at SPIs. It seems to me that any kind of assistance could help. And another thing about the number of edits is that too many samples can also present challenges related to noise, with signals getting smeared out, although the type of noise in a user's data can itself be a characteristic signal in some cases it seems. And if there are too few samples, you can generate synthetic samples based on the actual samples and inject them into spaces. Search strategy matters a lot. The space of everyone vs everyone is vast, so good luck finding potential matches in that space without a lot of compute, especially for diffs. But many socks inhabit relatively small subspaces of Wikipedia, at least in the 20%-ish of time (on average in PIA) they edit(war)/POV-push etc. in their topic of interest. So, choosing the candidate search space and search strategy wisely can make the problem much more tractable for a given topic area/subspace. Targeted fishing by picking a potential sock and looking for potential matches (the strategy used by the Similarusers service and CU I guess) is obviously a very different challenge than large-scale industrial fishing for socks in general. Sean.hoyland (talk) 04:08, 29 November 2024 (UTC)
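As a rough illustration of the point above about the significance of page intersections, here is a minimal sketch that weights each shared page by how few distinct editors it has. The function name, the input format, and the `1 / log2(1 + editors)` weighting are all assumptions chosen for illustration; this is not how EIA or the Similarusers service works.

```python
from math import log2


def weighted_intersection(pages_a, pages_b, editors_per_page):
    """Score the page overlap of two accounts, favouring obscure pages.

    pages_a, pages_b: iterables of page titles each account has edited.
    editors_per_page: dict of page title -> number of distinct editors
    (hypothetical input; a real tool would derive it from revision data).
    """
    shared = set(pages_a) & set(pages_b)
    score = 0.0
    for page in shared:
        # A shared page has at least two editors, so clamp the count at 2.
        editors = max(editors_per_page[page], 2)
        # Illustrative weight: the more editors a page has, the less an
        # overlap on it counts.
        score += 1.0 / log2(1 + editors)
    return score
```

Under this weighting, an overlap on a page with 3 distinct editors scores 0.5, while an overlap on a page with 1,023 distinct editors scores 0.1, so intersections on heavily edited pages contribute far less.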
- And to continue the whining about existing tools, EIA and the Similarusers service use a suboptimal strategy in my view. If the objective is page intersection information for a potential sock against a sockmaster, and a ban evasion source has employed n identified actors so far e.g. almost 50 accounts for Icewhiz, the source's revision data should be aggregated for the intersection. This is not difficult to do using the category graph and the logs. Sean.hoyland (talk) 04:25, 29 November 2024 (UTC)
- There is so much more that could be done with the software. EIA gives you page overlaps (and isn't 100% accurate at it), but it doesn't tell you:
- how many times the accounts made the same edits (tag team edit warring)
- how many times they voted in the same formal discussions (RfC, AfD, RM, etc) and whether they voted the same way or different (vote stacking)
- how many times they use the same language and whether they use unique phraseology
- whether they edit at the same times of day
- whether they edit on the same days
- whether account creation dates (or start-of-regular-editing dates) line up with when other socks were blocked
- whether they changed focus after reaching XC and to what extent (useful in any ARBECR area)
- whether they "gamed" or "rushed" to XC (same)
- All of this (and more) would be useful to see in a combined way, like a dashboard. It might make sense to restrict access to such compilations of data to CUs, and the software could also throw in metadata or subscriber info in there, too (or not), and it doesn't have to reduce it all into a single score like ORES, but just having this info compiled in one place would save editors the time of having to compile it manually. If the software auto-swept logs for this info and alerted humans to any "high scores" (however defined, eg "matches across multiple criteria"), it would probably not only reduce editor time but also increase sock discovery. Levivich (talk) 04:53, 29 November 2024 (UTC)
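Two of the signals listed above (editing at the same times of day, and page overlap) can be sketched from public revision data alone. The following is only an illustration under stated assumptions: the `(page_title, datetime)` input format and all function names are hypothetical, and cosine similarity and Jaccard overlap are stand-in measures, not what any existing tool uses.

```python
from collections import Counter
from datetime import datetime
from math import sqrt


def hour_histogram(timestamps):
    """Count edits per hour of day (0-23)."""
    counts = Counter(ts.hour for ts in timestamps)
    return [counts.get(h, 0) for h in range(24)]


def cosine_similarity(a, b):
    """Cosine similarity of two count vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(pages_a, pages_b):
    """Share of pages edited by both accounts among pages edited by either."""
    a, b = set(pages_a), set(pages_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_accounts(edits_a, edits_b):
    """edits_*: lists of (page_title, datetime) tuples (hypothetical format)."""
    return {
        "hour_similarity": cosine_similarity(
            hour_histogram([ts for _, ts in edits_a]),
            hour_histogram([ts for _, ts in edits_b]),
        ),
        "page_overlap": jaccard(
            [page for page, _ in edits_a],
            [page for page, _ in edits_b],
        ),
    }
```

Keeping the signals as separate fields rather than collapsing them into one score matches the dashboard idea described above: a human reviewer sees each measure and decides what it means.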
- This is like one of my favorite strategies for meetings. Propose multiple things, many of which are technically challenging, then just walk out of the meeting.
- The 'how many times the accounts made the same edits' is probably do-able because you can connect reverted revisions to the revisions that reverted them using json data in the database populated as part of the tagging system, look at the target state reverted to and whether the revision was an exact revert. ...or maybe not without computing diffs, having just looked at an article with a history of edit warring. Sean.hoyland (talk) 07:43, 29 November 2024 (UTC)
- I agree with Levivich that automated, privacy-protecting sock-detection is not a pipe dream. I proposed a system something like this in 2018, see also here, and more recently here. However, it definitely requires a bit of software development and testing. It also requires the community and the foundation devs or product folks to prioritize the idea. Andre🚐 02:27, 10 December 2024 (UTC)
- I disagree. Empirically, the vast majority of time spent at SPI is not on finding possible socks, nor is it using the CheckUser tool on them, but rather it's the CU completed cases (of which there are currently 14 and I should probably stop slacking and get onto some) with non-definitive technical results waiting on an administrator to make the final determination on whether they're socks or not. Extension:SimilarUsers would concentrate various information that already exists (EIA, RoySmith's SPI tools) in one place, but I wouldn't say the accessibility of these tools is a cause of SPI backlog. An AI analysis tool to give an accurate magic number for likelihood? I'm anything but a Luddite, but still believe that's wishful thinking. DatGuyTalkContribs 03:02, 29 November 2024 (UTC)
- Oh, fun! Too bad it's CU-restricted, I'm quite curious to know what user I'm most stylometrically similar to. -- asilvering (talk) 17:51, 28 November 2024 (UTC)
- Comment. For some time I have vehemently suspected that this site is crawling with massive numbers of sockpuppets, that the community seems to be unable or unwilling to recognise probable sockpuppets for what they are, and it is not feasible to send them to SPI one at a time. I see a large number of accounts that are sleepers, or that have low edit counts, trying to do things that are controversial or otherwise suspicious. I see them showing up at discussions in large numbers and in quick succession, and offering !votes that consist of interpretations of our policies and guidelines that may not reflect consensus, or other statements that may not be factually accurate.
- I think the solution is simple: when closing community discussions, admins should look at the edit count of each !voter when determining how much weight to give his !vote. The lower the edit count, the greater the level of sleeper behaviour, and the more controversial the subject of the discussion is amongst the community, the less weight should be given to !vote.
- For example, if an account with less than one thousand edits !votes in a discussion about 16th century Tibetan manuscripts, we may well be able to trust that !vote, because the community does not care about such manuscripts. But if the same account !votes on anything connected with "databases" or "lugstubs", we should probably give that !vote very little weight, because that was the subject of a massive dispute amongst the community, and any discussion on that subject is not particularly unlikely to be crawling with socks on both sides. The feeling is that, if you want to be taken seriously in such a controversial discussion, you need to make enough edits to prove that you are a real person, and not a sock. James500 (talk) 15:22, 12 December 2024 (UTC)
- The site presumably has a large number of unidentified sockpuppets. As for the identified ban evading accounts, accounts categorized or logged as socks, if you look at 2 million randomly selected articles for the 2023-10-07 to 2024-10-06 year, just under 2% of the revisions are by ban evading actors blocked for sockpuppetry (211,546 revisions out of 10,732,361). A problem with making weight dependent on edit count is that the edit count number does not tell you anything about the probability that an account is a sock. Some people use hundreds of disposable accounts, making just a few edits with each account. Others stick around and make thousands of edits before they are detected. Also, Wikipedia provides plenty of tools that people can use to rapidly increase their edit count. Sean.hoyland (talk) 16:12, 12 December 2024 (UTC)
- I strongly oppose any idea of mass-CUing any group of users, and I'm pretty sure the WMF does too. This isn't the right way to fight sockpuppets. QuicoleJR (talk) 14:35, 15 December 2024 (UTC)
- Can I ask why? Is it a privacy-based concern? IPs are automatically collected and stored for 90 days, and maybe for years in the backups, regardless of CUs. That's a 90 day window that a machine could use to do something with them without anyone running a CU and without anyone having to see what the machine sees. Sean.hoyland (talk) 15:05, 15 December 2024 (UTC)
- Primarily privacy concerns, as well as concerns about false positives. A lot of people here probably share an IP with other editors without even knowing it. I also would like to maintain my personal privacy, and I know many other editors would too. There are other methods of fighting sockpuppets that don't have as much collateral damage, and we should pursue those instead. QuicoleJR (talk) 15:16, 17 December 2024 (UTC)
- Also, it wouldn't even work on some sockpuppets, because IP info is only retained for 90 days, so a blocked editor could just wait out the 90 days and then return with a new account. QuicoleJR (talk) 15:19, 17 December 2024 (UTC)
- @Levivich: one situation where I think we could pull a lot of data, and probably detect tons of sockpuppets, is !votes like RfAs and RfCs. Those have a lot of data, in addition to a very strong incentive for socking. You'd expect to see a bimodal distribution where most accounts have moderately correlated views, but a handful have extremely strong correlations (always !voting the same way), more than could plausibly happen by chance or by overlapping views. For accounts in the latter group, we'd have strong grounds to suspect collusion/canvassing or socking.
- RfAs are already in a very nice machine-readable format. RfCs aren't, but most could easily be made machine-readable (by adopting a few standardized templates). We could also build a tool for semi-automated recoding of old RfCs to get more data. – Closed Limelike Curves (talk) 18:56, 16 December 2024 (UTC)
- Would that data help with the general problem? If there are a lot of socks on an RfA, I'd expect that to be picked up by editors. Those are very well-attended. The same may apply to many RfCs. Perhaps the less well-attended ones might be affected, but the main challenge is article edits, which will not be similarly structured. CMD (talk) 19:13, 16 December 2024 (UTC)
Would that data help with the general problem? If there are a lot of socks on an RfA, I'd expect that to be picked up by editors.
- Given we've had situations of sockpuppets being made admins themselves, I'm not too sure of this myself. If someone did create a bunch of socks, as some people have alleged in this thread, it'd be weird of them not to use those socks to influence policy decisions. I'm pretty skeptical, but I do think investigating would be a good idea (if nothing else because of how important it is—even the possibility of substantial RfA/RfC manipulation is quite bad, because it undermines the whole idea of consensus). – Closed Limelike Curves (talk) 21:04, 16 December 2024 (UTC)
- RFAs, RfCs, RMs, AfDs, and arbcom elections. Levivich (talk) 23:11, 17 December 2024 (UTC)
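The pairwise correlation idea raised in this thread could be prototyped roughly as follows. This is a minimal sketch, assuming !votes have already been extracted into a machine-readable mapping; the data layout, the account names, and the `min_shared` threshold are all hypothetical. As noted in the thread, a high agreement rate can also come from genuinely overlapping views, so the output would only be a starting point for human review.

```python
from itertools import combinations


def agreement_rates(votes, min_shared=3):
    """Compute how often each pair of accounts !voted the same way.

    votes maps account name -> {discussion id: position string}.
    Returns {(a, b): (shared_count, agreement_rate)} for every pair that
    took part in at least min_shared of the same discussions.
    """
    results = {}
    for a, b in combinations(sorted(votes), 2):
        shared = votes[a].keys() & votes[b].keys()
        if len(shared) < min_shared:
            continue  # too little common data to say anything
        same = sum(1 for d in shared if votes[a][d] == votes[b][d])
        results[(a, b)] = (len(shared), same / len(shared))
    return results


# Hypothetical sample data: three accounts across three discussions.
sample_votes = {
    "AccountA": {"d1": "support", "d2": "oppose", "d3": "support"},
    "AccountB": {"d1": "support", "d2": "oppose", "d3": "support"},
    "AccountC": {"d1": "oppose", "d2": "oppose", "d3": "support"},
}
rates = agreement_rates(sample_votes)
```

With real data, the interesting step would be looking at the distribution of these rates across all pairs and checking whether a small cluster sits far above the rest.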
What do we do with this information?
I think we've put the cart before the horse here a bit. While we've established it's possible to detect most sockpuppets automatically (and the WMF is already working on it), it's not clear what this would actually achieve, because having multiple accounts isn't against the rules. I think we'd need to establish a set of easy-to-enforce boundaries for people using multiple accounts. My proposal is to keep it simple: two accounts controlled by the same person can't edit the same page (or participate in the same discussion) without disclosing they're the same editor. – Closed Limelike Curves (talk) 04:41, 14 December 2024 (UTC)
- This is already covered by WP:LEGITSOCK I think. Andre🚐 05:03, 14 December 2024 (UTC)
- And as there are multiple legitimate ways to disclose, not all of which are machine readable, any automatically generated list is going to need human review. Thryduulf (talk) 10:13, 14 December 2024 (UTC)
- Yes, that's definitely the case, an automatic sock detection should probably never be an autoblock, or at least not unless there is a good reason in that specific circumstance, like a well-trained filter for a specific LTA. Having the output of automatic sock detection should still be restricted to CU/OS or another limited user group who can be trusted to treat possible user-privacy-related issues with discretion, and have gone through the appropriate legal rigmarole. There could also be some false positives or unusual situations when piloting a program like this. For example, I've seen dynamic IPs get assigned to someone else after a while, which is unlikely but not impossible depending on how an ISP implements DHCP, though I guess collisions become less common with IPV6. Or if the fingerprinting is implemented with a lot of datapoints to reduce the likelihood of false positives. Andre🚐 10:31, 14 December 2024 (UTC)
- I think we are probably years away from being able to rely on autonomous agents to detect and block socks without a human in the loop. For now, people need as much help as they can get to identify ban evasion candidates. Sean.hoyland (talk) 10:51, 14 December 2024 (UTC)
or at least not unless there is a good reason in that specific circumstance,
- Yep, basically I'm saying we need to define "good reason". The obvious situation is automatically blocking socks of blocked accounts. I also think we should just automatically prevent detected socks from editing the same page (ideally make it impossible, to keep it from being done accidentally). – Closed Limelike Curves (talk) 17:29, 14 December 2024 (UTC)
Revise Wikipedia:INACTIVITY
Point 1 of Procedural removal for inactive administrators which currently reads "Has made neither edits nor administrative actions for at least a 12-month period" should be replaced with "Has made no administrative actions for at least a 12-month period". The current wording of 1. means that an Admin who takes no admin actions keeps the tools provided they make at least a few edits every year, which really isn't the point. The whole purpose of adminship is to protect and advance the project. If an admin isn't using the tools then they don't need to have them. Mztourist (talk) 07:47, 4 December 2024 (UTC)
Endorsement/Opposition (Admin inactivity removal)
- Support as proposer. Mztourist (talk) 07:47, 4 December 2024 (UTC)
- Oppose - this would create an unnecessary barrier to admins who, for real life reasons, have limited engagement for a bit. Asking the tools back at BN can feel like a faff. Plus, logged admin activity is a poor guide to actual admin activity. In some areas, maybe half of actions aren't logged? —Femke 🐦 (talk) 19:17, 4 December 2024 (UTC)
- Oppose. First, not all admin actions are logged as such. One example which immediately comes to mind is declining an unblock request. In the logs, that's just a normal edit, but it's one only admins are permitted to make. That aside, if someone has remained at least somewhat engaged with the project, they're showing they're still interested in returning to more activity one day, even if real-life commitments prevent them from it right now. We all have things come up that take away our available time for Wikipedia from time to time, and that's just part of life. Say, for example, someone is currently engaged in a PhD program, which is a tremendously time-consuming activity, but they still make an edit here or there when they can snatch a spare moment. Do we really want to discourage that person from coming back around once they've completed it? Seraphimblade Talk to me 21:21, 4 December 2024 (UTC)
- We could declare specific types of edits which count as admin actions despite being mere edits. It should be fairly simple to write a bot which checks if an admin has added or removed specific texts in any edit, or made any of specific modifications to pages. Checking for protected edits can be a little harder (we need to check for protection at the time of edit, not for the time of the check), but even this can be managed. Edits to pages which match specific regular expression patterns should be trivial to detect. Animal lover |666| 11:33, 9 December 2024 (UTC)
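The pattern-matching part of such a bot might look something like the sketch below. It is illustrative only: the two patterns and their labels are hypothetical stand-ins, not an agreed list of edits that count as admin actions, and fetching the edits and checking protection status at the time of the edit are left out entirely.

```python
import re

# Hypothetical patterns for text an admin might add when acting in an admin
# capacity without producing a log entry. A real list would need consensus.
ADMIN_EDIT_PATTERNS = {
    "unblock_review": re.compile(r"\{\{\s*unblock reviewed\b", re.IGNORECASE),
    "discussion_close": re.compile(r"The result was\s+'''", re.IGNORECASE),
}


def classify_edit(added_text):
    """Return the sorted labels of all patterns found in the text an edit added."""
    return sorted(
        label
        for label, pattern in ADMIN_EDIT_PATTERNS.items()
        if pattern.search(added_text)
    )


def admins_with_matching_edits(edits):
    """Group matched labels by user from an iterable of (username, added_text)."""
    found = {}
    for user, text in edits:
        labels = classify_edit(text)
        if labels:
            found.setdefault(user, set()).update(labels)
    return found


# Hypothetical sample edits.
sample_edits = [
    ("ExampleAdmin", "{{unblock reviewed|1=Please unblock|decline=No.}}"),
    ("ExampleAdmin", "Fixed a typo"),
    ("OtherAdmin", "The result was '''keep'''."),
]
matches = admins_with_matching_edits(sample_edits)
```

A text match alone cannot tell whether a closure was one that required admin rights, so a pattern like `discussion_close` would over-count without further checks.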
- Oppose There's no indication that this is a problem needs fixing. ⇒SWATJester Shoot Blues, Tell VileRat! 00:55, 5 December 2024 (UTC)
- Support Admins who don't use the tools should not have the tools. * Pppery * it has begun... 03:55, 5 December 2024 (UTC)
- Oppose While I have never accepted "not all admin actions are logged" as a realistic reason for no logged actions in an entire year, I just don't see what problematic group of admins this is in response to. Previous tweaks to the rules were in response to admins that seemed to be gaming the system, that were basically inactive and when they did use the tools they did it badly, etc. We don't need a rule that isn't pointed at a provable, ongoing problem. Just Step Sideways from this world ..... today 19:19, 8 December 2024 (UTC)
- Oppose If an admin is still editing, it's not unreasonable to assume that they are still up to date with policies, community norms etc. I see no particular risk in allowing them to keep their tools. Scribolt (talk) 19:46, 8 December 2024 (UTC)
- Oppose: It feels like some people are trying to accelerate admin attrition and I don't know why. This is a solution in search of a problem. Gnomingstuff (talk) 07:11, 10 December 2024 (UTC)
- Oppose Sure there is a problem, but the real problem I think is that it is puzzling why they are still admins. Perhaps we could get them all to make a periodic 'declaration of intent' or some such every five years that explains why they want to remain an admin. Alanscottwalker (talk) 19:01, 11 December 2024 (UTC)
- Oppose largely per scribolt. We want to take away mops from inactive accounts where there is a risk of them being compromised, or having got out of touch with community norms, this proposal rather targets the admins who are active members of the community. Also declining incorrect deletion tags and AIV reports doesn't require the use of the tools, doesn't get logged but is also an important thing for admins to do. ϢereSpielChequers 07:43, 15 December 2024 (UTC)
- Oppose. What is the motivation for this frenzy to make more hoops for admins to jump through and use not jumping through hoops as an excuse to de-admin them? What problem does it solve? It seems counterproductive and de-inspiring when the bigger issue is that we don't have enough new admins. —David Eppstein (talk) 07:51, 17 December 2024 (UTC)
- Oppose Some admin actions aren't logged, and I also don't see why this is necessary. Worst case scenario, we have WP:RECALL. QuicoleJR (talk) 15:25, 17 December 2024 (UTC)
- Oppose I quite agree with David Eppstein's sentiment. What's with the rush to add more hoops? Is there some problem with the admin corps that we're not adequately dealing with? Our issue is that we have too few admins, not that we have too many. CaptainEek Edits Ho Cap'n!⚓ 23:20, 22 December 2024 (UTC)
Discussion (Admin inactivity removal)
- Making administrative actions can be helpful to show that the admin is still up-to-date with community norms. We could argue that if someone is active but doesn't use the tools, it isn't a big issue whether they have them or not. Still, the tools can be requested back following an inactivity desysop, if the formerly inactive admin changes their mind and wants to make admin actions again. For now, I don't see any immediate issues with this proposal. Chaotic Enby (talk · contribs) 08:13, 4 December 2024 (UTC)
- Looking back at previous RFCs, in 2011 the reasoning was to reduce the attack surface for inactive account takeover, and in 2022 it was about admins who haven't been around enough to keep up with changing community norms. What's the justification for this besides "use it or lose it"? Further, we already have a mechanism (from the 2022 RFC) to account for admins who make a few edits every year. Anomie⚔ 12:44, 4 December 2024 (UTC)
- I also note that not all admin actions are logged. Logging editing through full protection requires abusing the Edit Filter extension. Reviewing of deleted content isn't logged at all. Who will decide whether an admin's XFD "keep" closures are really WP:NACs or not? Do adminbot actions count for the operator? There are probably more examples. Currently we ignore these edge cases since the edits will probably also be there, but now if we can desysop someone who made 100,000 edits in the year we may need to consider them. Anomie⚔ 12:44, 4 December 2024 (UTC)
- I had completely forgotten that many admin actions weren't logged (and thus didn't "count" for activity levels), that's actually a good point (and stops the "community norms" arguments as healthy levels of community interaction can definitely be good evidence of that). And, since admins desysopped for inactivity can request the tools back, an admin needing the bit but not making any logged actions can just ask for it back. At this point, I'm not sure if there's a reason to go through the automated process of desysopping/asking for resysop at all, rather than just politely ask the admin if they still need the tools. I'm still very neutral on this by virtue of it being a pretty pointless and harmless process either way (as, again, there's nothing preventing an active admin desysopped for "inactivity" from requesting the tools back), but I might lean oppose just so we don't add a pointless process for the sake of it. Chaotic Enby (talk · contribs) 15:59, 4 December 2024 (UTC)
- To me this comes down to whether the community considers it problematic for an admin to have tools they aren't using. Since it's been noted that not all admin actions are logged, and an admin who isn't using their tools also isn't causing any problems, I'm not sure I see a need to actively remove the tools from an inactive admin; in a worst-case scenario, isn't this encouraging an admin to (potentially mis-)use the tools solely in the interest of keeping their bit? There also seems to be somewhat of a bad-faith assumption to the argument that an admin who isn't using their tools may also be falling behind on community norms. I'd certainly like to hope that if I was an admin who had been inactive that I would review P&G relevant to any admin action I intended to undertake before I executed. DonIago (talk) 15:14, 4 December 2024 (UTC)
- As I have understood it, the original rationale for desysopping after no activity for a year was the perception that an inactive account was at higher danger of being hijacked. It had nothing to do with how often the tools were being used, and presumably, if the admin was still editing, even if not using the tools, the account was less likely to be hijacked. - Donald Albury 22:26, 4 December 2024 (UTC)
- And also, if the account of an active admin was hijacked, both the account owner and those they interact with regularly would be more likely to notice the hijacking. The sooner a hijacked account is identified as hijacked, the sooner it is blocked/locked which obviously minimises the damage that can be done. Thryduulf (talk) 00:42, 5 December 2024 (UTC)
- I was not aware that not all admin actions are logged, obviously they should all be correctly logged as admin actions. If you're an Admin you should be doing Admin stuff, if not then you obviously don't need the tools. If an Admin is busy IRL then they can either give up the tools voluntarily or get desysopped for inactivity. The "Asking the tools back at BN can feel like a faff." isn't a valid argument, if an Admin has been desysopped for inactivity then getting the tools back should be "a faff". Regarding the comment that "There's no indication that this is a problem needs fixing." the problem is Admins who don't undertake admin activity, don't stay up to date with policies and norms, but don't voluntarily give up the tools. The 2022 change was about total edits over 5 years, not specifically admin actions and so didn't adequately address the issue. Mztourist (talk) 03:23, 5 December 2024 (UTC)
obviously they should all be correctly logged as admin actions
- how would you log actions that are administrative actions due to context/requiring passive use of tools (viewing deleted content, etc.) rather than active use (deleting/undeleting, blocking, and so on)/declining requests where accepting them would require tool use? (e.g. closing various discussions that really shouldn't be NAC'd, reviewing deleted content, declining page restoration) Maybe there are good ways of doing that, but I haven't seen any proposed the various times this subject came up. Unless and until "soft" admin actions are actually logged somehow, "editor has admin tools and continues to engage with the project by editing" is the closest, if very imperfect, approximation to it we have, with criterion 2 sort-of functioning to catch cases of "but these specific folks edit so little over a prolonged time that it's unlikely they're up-to-date and actively engaging in soft admin actions". (I definitely do feel criterion 2 could be significantly stricter, fwiw) AddWittyNameHere 05:30, 5 December 2024 (UTC)
- Not being an Admin I have no idea how their actions are or aren't logged, but is it a big ask that Admins perform at least a few logged Admin actions in a year? The "imperfect, approximation" that "editor has admin tools and continues to engage with the project by editing" is completely inadequate to capture Admin inactivity. Mztourist (talk) 07:06, 6 December 2024 (UTC)
- Why is it "completely inadequate"? Thryduulf (talk) 10:32, 6 December 2024 (UTC)
- I've been a "hawk" regarding admin activity standards for a very long time, but this proposal comes off as half-baked. The rules we have now are the result of careful consideration and incremental changes aimed at specific, provable issues with previous standards. While I am not a proponent of "not all actions are logged" as a blanket excuse for no logged actions in several years, it is feasible that an admin could be otherwise fully engaged with the community while not having any logged actions. We haven't been having trouble with admins who would be removed by this, so where's the problem? Just Step Sideways from this world ..... today 19:15, 8 December 2024 (UTC)
"Blur all images" switch
Although I know that WP:NOTCENSORED, I propose that the Vector 2022 and Minerva Neue skins (+the Wikipedia mobile apps) have a "blur all images" toggle that blurs all the images on all pages (requiring clicking on them to view them), which simplifies the process of doing HELP:NOSEE as that means:
- You don't need to create an account to hide all images.
- You don't need any complex JavaScript or CSS installation procedures. Not even browser extensions.
- You can blur all images in the mobile apps, too.
- It's all done with one push of a button. No extra steps needed.
- Blurring all images > hiding all images. The content of a blurred image could be easily memorized, while a completely hidden image is difficult to compare to the others.
And it shouldn't be limited to just Wikipedia. This toggle should be available on all other WMF projects and MediaWiki-powered wikis, too. 67.209.128.126 (talk) 15:26, 5 December 2024 (UTC)
- Sounds good. Damon will be thrilled. Martinevans123 (talk) 15:29, 5 December 2024 (UTC)
- Sounds like something I can try to make a demo of as a userscript! Chaotic Enby (talk · contribs) 15:38, 5 December 2024 (UTC)
- User:Chaotic Enby/blur.js should do the job, although I'm not sure how to deal with the Page Previews extension's images. Chaotic Enby (talk · contribs) 16:16, 5 December 2024 (UTC)
- Wow, @Chaotic Enby, is that usable on all skins/browsers/devices? If so, we should be referring people to it from everywhere instead of the not-very-helpful WP:NOSEE, which I didn't even bother to try to figure out. Valereee (talk) 15:00, 17 December 2024 (UTC)
- I haven't tested it beyond my own setup, although I can't see reasons why it wouldn't work elsewhere. However, there are two small bugs I'm not sure how to fix: when loading a new page, the images briefly show up for a fraction of a second before being blurred; and the images in Page Previews aren't blurred (the latter, mostly because I couldn't get the html code for the popups). Chaotic Enby (talk · contribs) 16:57, 17 December 2024 (UTC)
- Ah, yes, I see both of those. Probably best to get at least the briefly-showing bug fixed before recommending it generally. The page previews would be good to fix but may be less of an issue for recommending generally, since people using that can be assumed to know how to turn it off. Valereee (talk) 18:28, 17 December 2024 (UTC)
- I don't think there's a way to get around when the Javascript file is loaded and executed. I think users will have to modify their personal CSS file to blur images on initial load, much like the solution described at Help:Options to hide an image § Hide all images until they are clicked on. isaacl (talk) 18:41, 17 December 2024 (UTC)
- @Valereee -- the issue with a script would be as follows:
- Even for logged-in users, user scripts are a moderate barrier to install (digging through settings, or worse still, having to copy-paste to the JS/CSS user pages).
- The majority of readers do not have an account, and the overwhelming majority of all readers make zero edits. For many people, it's too much of a hassle to sign up (or they can't remember their password, or a number of other reasons etc, etc)
- What all readers and users have, though, is this menu:
- I say instead of telling the occasional IP or user who complains to install a script (there are probably many more people who object to NOTCENSORED, but don't want to or don't know how to voice objections), we could add the option to replace all images with a placeholder (or blur) and perhaps also an option to increase thumbnail size.
- On the image blacklist aspect, doesn't Anomie have a script that hides potentially offensive images? I've not a clue how it works, but perhaps it could be added to the appearance menu (I don't support this myself, for a number of reasons)
- JayCubby 18:38, 17 December 2024 (UTC)
- That's User:Anomie/hide-images, which is already listed on WP:NOSEE. I wrote it a long time ago as a joke for one of these kinds of discussions: it does very well at hiding all "potentially offensive" images because it hides all images. But people who want to have to click to see any images found it useful enough to list it on WP:NOSEE. Anomie⚔ 22:52, 17 December 2024 (UTC)
- Out of curiosity, how does it filter for potentially offensive images? The code at user:Anomie/hide-images.js seems rather minimal (as I write this, I realize it may work by hiding all images, so I may have answered my own question). JayCubby 22:58, 17 December 2024 (UTC)
because it hides all images
isaacl (talk) 23:11, 17 December 2024 (UTC)
- Will be a problem for non registered users, as the default would clearly be to leave images unblurred for them. Masem (t) 15:40, 5 December 2024 (UTC)
- Better show all images by default for all users. If you clear your cookies often you can simply change the toggle every time. 67.209.128.132 (talk) 00:07, 6 December 2024 (UTC)
- That's my point: if you are unregistered, you will see whatever the default setting is (which I assume will be unblurred, which might lead to more complaints). We had similar problems dealing with image thumbnail sizes, a setting that unregistered users can't adjust. Masem (t) 01:10, 6 December 2024 (UTC)
- I'm confused about how this would lead to more complaints. Right now, logged-out users see every image without obfuscation. After this toggle rolls out, logged-out users would still see every image without obfuscation. What fresh circumstance is leading to new complaints? ꧁Zanahary꧂ 07:20, 12 December 2024 (UTC)
- Well, we'd be putting in an option to censor, but not actively doing it. People will have issues with that. Lee Vilenski (talk • contribs) 10:37, 12 December 2024 (UTC)
- Isn't the page Help:Options to hide an image "an option to censor" we've put in? Gråbergs Gråa Sång (talk) 11:09, 12 December 2024 (UTC)
- I'm not opposed to this, if it can be made to work, fine. Gråbergs Gråa Sång (talk) 19:11, 5 December 2024 (UTC)
- What would be the goal of a blur all images option? It seems too tailored. But a "hide all images" could be suitable. EEpic (talk) 06:40, 11 December 2024 (UTC)
- Simply removing them might break page layout, so images could be replaced with an equally sized placeholder. JayCubby 13:46, 13 December 2024 (UTC)
Could there be an option to simply not load images for people with a low-bandwidth connection or who don't want them? Travellers & Tinkers (talk) 16:36, 5 December 2024 (UTC)
- I agree. This way, the options would go as
- Show all images
- Blur all images
- Hide all images
- It would honestly be better with your suggestion. 67.209.128.132 (talk) 00:02, 6 December 2024 (UTC)
- Of course, it will do nothing to appease the "These pics shouldn't be on WP at all" people. Gråbergs Gråa Sång (talk) 06:52, 6 December 2024 (UTC)
- “Commons be thataway” is what we should tell them Dronebogus (talk) 18:00, 11 December 2024 (UTC)
- I suggest that the "hide all images" display file name if possible. Between file name and caption (which admittedly are often similar, but not always), there should be sufficient clue whether an image will be useful (and some suggestion, but not reliably so, if it may offend a sensibility.) -- Nat Gertler (talk) 17:59, 11 December 2024 (UTC)
- For low-bandwidth or expensive bandwidth -- many folks are on mobile plans which charge for bandwidth. -- Nat Gertler (talk) 14:28, 11 December 2024 (UTC)
Regarding not limiting image management choices to Wikipedia: that's why it's better to manage this on the client side. Anyone needing to limit their bandwidth usage, or to otherwise decide individually on whether or not to load each photo, will likely want to do this generally in their web browsing. isaacl (talk) 18:43, 6 December 2024 (UTC)
- Definitely a browser issue. You can get plug-ins for Chrome right now that will do exactly this, and there's no need for Wikipedia/Mediawiki to implement anything. — The Anome (talk) 18:48, 6 December 2024 (UTC)
I propose something a bit different: all images on the bad images list can only be viewed with a user account that has been verified to be over 18 with government issued ID. I say this because in my view there is absolutely no reason for a minor to view it. Jayson (talk) 23:41, 8 December 2024 (UTC)
- Well, that means readers will be forced to not only create an account, but also disclose sensitive personal information, just to see encyclopedic images. That is pretty much the opposite of a free encyclopedia. Chaotic Enby (talk · contribs) 23:44, 8 December 2024 (UTC)
- I can support allowing users to opt to blur or hide some types of images, but this needs to be an opt-in only. By default, show all images. And I'm also opposed to any technical restriction which requires self-identification to overcome, except for cases where the Foundation deems it necessary to protect private information (checkuser, oversight-level hiding, or emails involving private information). Please also keep in mind that even if a user sends a copy of an ID which indicates the individual person's age, there is no way to verify that it was the user's own ID which had been sent. Animal lover |666| 11:25, 9 December 2024 (UTC)
- Also, the bad images list is a really terrible standard. Around 6% of it is completely harmless content that happened to be abused. And even some of the “NSFW” images are perfectly fine for children to view, for example File:UC and her minutes-old baby.jpg. Are we becoming Texas or Florida now? Dronebogus (talk) 18:00, 11 December 2024 (UTC)
- You could've chosen a much better example like dirty toilet or the flag of Hezbollah... Traumnovelle (talk) 19:38, 11 December 2024 (UTC)
- Well, yes, but I rank that as “harmless”. I don’t know why anyone would consider a woman with her newborn baby so inappropriate for children it needs to be censored like hardcore porn. Dronebogus (talk) 14:53, 12 December 2024 (UTC)
- The Hezbollah flag might be blacklisted because it's copyrighted, but placed in articles by uninformed editors (though one of JJMC89's bots automatically removes NFC files from pages). We have File:InfoboxHez.PNG for those uses. JayCubby 16:49, 13 December 2024 (UTC)
- I support this proposal. It’s a very clean compromise between the “think of the children” camp and the “freeze peach camp”. Dronebogus (talk) 17:51, 11 December 2024 (UTC)
- Let me dox myself so I can view this image. Even Google image search doesn't require something this stringent. Lee Vilenski (talk • contribs) 19:49, 11 December 2024 (UTC)
- oppose should not be providing toggles to censor. ValarianB (talk) 15:15, 12 December 2024 (UTC)
- What about an option to disable images entirely? It might use significantly less data. JayCubby 02:38, 13 December 2024 (UTC)
- This is an even better idea as an opt-in toggle than the blur one. Load no images by default, and let users click a button to load individual images. That has a use beyond sensitivity. ꧁Zanahary꧂ 02:46, 13 December 2024 (UTC)
- Yes I like that idea even better. I think in any case we should use alt text to describe the image so people don’t have to play Russian roulette based on potentially vague or nonexistent descriptions, i.e. without alt text an ignorant reader would have no idea the album cover for Virgin Killer depicts a nude child in a… questionable pose. Dronebogus (talk) 11:42, 13 December 2024 (UTC)
- An option to replace images with alt text seems both much more useful and much more neutral as an option. There are technical reasons why a user might want to not load images (or only selectively load them based on the description), so that feels more like a neutral interface setting. An option to blur images by default sends a stronger message that images are dangerous.--Trystan (talk) 16:24, 13 December 2024 (UTC)
- Also it'd negate the bandwidth savings somewhat (assuming an image is displayed as a low pixel-count version). I'm of the belief that Wikipedia should have more features tailored to the reader. JayCubby 16:58, 13 December 2024 (UTC)
- At the very least, add a filter that allows you to block all images on the bad image list, specifically that list and those images. To the people who say you shouldn't have to give up personal info, I say that we should go the way Roblox does. Seems a bit random, hear me out: To play 17+ games, you need to verify with gov id, those games have blood, alcohol, unplayable gambling and "romance". I say that we do the same. Giving up personal info to view bad things doesn't seem so bad to me. Jayson (talk) 03:44, 15 December 2024 (UTC)
- Building up a database of people who have applied to view bad things on a service that's available in restrictive regimes sounds like a way of putting our users in danger. -- Nat Gertler (talk) 07:13, 15 December 2024 (UTC)
- Roblox =/= Wikipedia. I don’t know why I have to say this, nor did I ever think I would. And did you read what I already said about the “bad list”? Do you want people to have to submit their ID to look at poop, a woman with her baby, the Hezbollah flag, or graffiti? How about we age-lock articles about adult topics next? Dronebogus (talk) 15:55, 15 December 2024 (UTC)
- Ridiculous. Lee Vilenski (talk • contribs) 16:21, 15 December 2024 (UTC)
- So removing a significant thing that makes Wikipedia free is worth preventing underaged users from viewing certain images? I wouldn't say that would be a good idea if we want to make Wikipedia stay successful. If a reader wants to read an article, they should expect to see images relevant to the topic. This includes topics that are usually considered NSFW like Graphic violence, Sexual intercourse, et cetera. If a person willingly reads an article about an NSFW topic, they should acknowledge that they would see topic-related NSFW images. ZZZ'S 16:45, 15 December 2024 (UTC)
- What "bad things"? You haven't listed any. --User:Khajidha (talk) (contributions) 15:57, 17 December 2024 (UTC)
- This is moot. Requiring personal information to use Wikipedia isn't something this community even has the authority to do. Valereee (talk) 16:23, 17 December 2024 (UTC)
- Yes, if this happens it should be through a disable all images toggle, not an additional blur. There have been times that would have been very helpful for me. CMD (talk) 03:52, 15 December 2024 (UTC)
- Support the proposal as written. I'd imagine WMF can add a button below the already-existing accessibility options. People have different cultural, safety, age, and mental needs to block certain images. Ca talk to me! 13:04, 15 December 2024 (UTC)
- I'd support an option to replace images with the alt text, as long as all you had to do to see a hidden image was a single click/tap (we'd need some fallback for when an image has no alt text, but that's a minor issue). Blurring images doesn't provide any significant bandwidth benefits and could in some circumstances cause problems (some blurred innocent images look very similar to some blurred images that some people regard as problematic, e.g. human flesh and cooked chicken). I strongly oppose anything that requires submitting personal information of any sort in order to see images per NatGertler. Thryduulf (talk) 14:15, 15 December 2024 (UTC)
- Fallback for alt text could be filename, which is generally at least slightly descriptive. -- Nat Gertler (talk) 14:45, 15 December 2024 (UTC)
- These ideas (particularly the toggle button to blur/hide all images) can be suggested at m:Community Wishlist. Some1 (talk) 15:38, 15 December 2024 (UTC)
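For illustration only, the alt-text-with-filename-fallback idea discussed above could be sketched roughly as follows. The function name `placeholder_label` and the example file name are hypothetical, and this is not how any existing gadget or MediaWiki feature works; it only shows the fallback order (alt text first, then a cleaned-up file name).

```python
from typing import Optional


def placeholder_label(alt_text: Optional[str], file_name: str) -> str:
    """Pick the text shown in place of a hidden image (hypothetical helper)."""
    # Prefer the alt text whenever the image actually has some.
    if alt_text and alt_text.strip():
        return alt_text.strip()

    # Fallback: derive a readable label from the file name.
    name = file_name
    if name.lower().startswith("file:"):
        name = name[len("file:"):]
    name = name.replace("_", " ")

    # Drop the extension, but keep the whole name if there is no stem.
    stem, dot, _extension = name.rpartition(".")
    return (stem if dot and stem else name).strip()


if __name__ == "__main__":
    print(placeholder_label(None, "File:Example_photo.jpg"))
    print(placeholder_label("  A red bus ", "File:Bus.png"))
```

A real implementation would presumably live in a user script or in the skin and would also need the single click/tap reveal mentioned above; the sketch covers only the choice of label.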
Class icons in categories
This is something that has frequently occurred to me as a potentially useful feature when browsing categories, but I have never quite gotten around to actually proposing it until now.
Basically, I'm thinking it could be very helpful to have content-assessment class icons appear next to article entries in categories. This should be helpful not only to readers, to guide them to the more complete entries, but also to editors, to alert them to articles in the category that are in need of work. Thoughts? Gatoclass (talk) 03:02, 7 December 2024 (UTC)
- If we go with this, I think there should be only 4 levels - Stub, Average (i.e. Start, C, or B), GA, & FA.
- There are significant differences between Start, C, and B, but there's no consistent effort to grade these articles correctly and consistently, so it might be better to lump them into one group. Especially if an article goes down in quality, almost nobody will bother to demote it from B to C. ypn^2 04:42, 8 December 2024 (UTC)
- Isn't that more of an argument for consolidation of the existing levels rather than reducing their number for one particular application?
- Other than that, I think I would have to agree that there are too many levels - the difference between Start and C class, for example, seems quite arbitrary, and I'm not sure of the usefulness of A class - but the lack of consistency within levels is certainly not confined to these lower levels, as GAs can vary enormously in quality and even FAs. But the project nonetheless finds the content assessment model to be useful, and I still think their usefulness would be enhanced by addition to categories (with, perhaps, an ability to opt in or out of the feature).
- I might also add that including content assessment class icons to categories would be a good way to draw more attention to them and encourage users to update them when appropriate. Gatoclass (talk) 14:56, 8 December 2024 (UTC)
- I believe anything visible in reader-facing namespaces needs to be more definitively accurate than in editor-facing namespaces. So I'm fine having all these levels on talk pages, but not on category pages, unless they're applied more rigorously.
- On the other hand, with FAs and GAs, although standards vary within a range, they do undergo a comprehensive, well-documented, and consistent process for promotion and demotion. So just like we have an icon at the top of those articles (and in the past, next to interwiki links), I could hear putting them in categories. [And it's usually pretty obvious whether something's a stub or not.] ypn^2 18:25, 8 December 2024 (UTC)
- Isn't the display of links on Category pages entirely dependent on the MediaWiki software? We don't even have Short descriptions displayed, which would probably be considerably more useful. Any function that has to retrieve content from member articles (much less their talkpages) is likely to be somewhat computationally expensive. Someone with more technical knowledge may have better information. Folly Mox (talk) 18:01, 8 December 2024 (UTC)
- Yes, this will definitely require MediaWiki development, but probably not so complex. And I wonder why this will be more computationally expensive than scanning articles for [ [Category: ] ] tags in the first place. ypn^2 18:27, 8 December 2024 (UTC)
- "And I wonder why this will be more computationally expensive than scanning articles for [ [Category: ] ] tags in the first place" My understanding is that this is not what happens. When a category is added to or removed from an article, the software adds or removes that page as a record from a database, and that database is what is read when viewing the category page. Thryduulf (talk) 20:14, 8 December 2024 (UTC)
- I think that in the short term, this could likely be implemented using a user script (displaying short descriptions would also be nice). Longer-term, if done via an extension, I suggest limiting the icons to GAs and FAs for readers without accounts, as other labels aren't currently accessible to them. (Whether this should change is a separate but useful discussion). — Frostly (talk) 23:06, 8 December 2024 (UTC)
- I'd settle for a user script. Who wants to write it? :) Gatoclass (talk) 23:57, 8 December 2024 (UTC)
- As an FYI for whoever decides to write it, Special:ApiHelp/query+pageassessments may be useful to you. Anomie⚔ 01:04, 9 December 2024 (UTC)
- @Gatoclass, the Wikipedia:Metadata gadget already exists. Go to Special:Preferences#mw-prefsection-gadgets-gadget-section-appearance and scroll about two-thirds of the way through that section.
- I strongly believe that ordinary readers don't care about this kind of inside baseball, but if you want it for yourself, then use the gadget or fork its script. Changing this old gadget from "adding text and color" to "displaying an icon" should be relatively simple. WhatamIdoing (talk) 23:43, 12 December 2024 (UTC)
- I strongly oppose loading any default javascript solution that would cause hundreds of client-side queries every time a category page is opened. As far as making an upstream software request, there are multiple competing page quality metrics and schemes that would need to be reviewed. — xaosflux Talk 15:13, 18 December 2024 (UTC)
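As a rough sketch of how a user script could keep the number of requests down, titles can be batched into a handful of `prop=pageassessments` queries instead of one query per article. The helper name `assessment_query_urls` is hypothetical, the batch size of 50 assumes the usual per-request title limit for ordinary accounts, and the example only builds request URLs without contacting the API.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
BATCH_SIZE = 50  # assumed titles-per-request limit for ordinary accounts


def assessment_query_urls(titles, batch_size=BATCH_SIZE):
    """Build one pageassessments query URL per batch of titles."""
    urls = []
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        params = {
            "action": "query",
            "prop": "pageassessments",
            # Multiple titles are passed as one pipe-separated value.
            "titles": "|".join(batch),
            "format": "json",
            "formatversion": "2",
        }
        urls.append(f"{API_URL}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    # A category page listing 200 members would need 4 requests, not 200.
    members = [f"Page {i}" for i in range(200)]
    print(len(assessment_query_urls(members)))
```

Even batched, this is still extra client-side traffic on every category view, so it seems better suited to an opt-in script than to anything loaded by default.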
Cleaning up NA-class categories
We have a long-standing system of double classification of pages, by quality (stub, start, C, ...) and importance (top, high, ...). And then there are thousands of pages that don't need either of these; portals, redirects, categories, ... As a result most of these pages have a double or even triple categorization, e.g. Portal talk:American Civil War/This week in American Civil War history/38 is in Category:Portal-Class United States articles, Category:NA-importance United States articles, and Category:Portal-Class United States articles of NA-importance.
My suggestion would be to put those pages only in the "Class" category (in this case Category:Portal-Class United States articles), and only give that category a NA-rating. Doing this for all these subcats (File, Template, ...) would bring the at the moment 276,534 (!) pages in Category:NA-importance United States articles back to near-zero, only leaving the anomalies which probably need a different importance rating (and thus making it a useful cleanup category).
It is unclear why we have two systems (3 cat vs. 2 cat), the tags on Category talk:2nd millennium in South Carolina (without class or NA indication) have a different effect than the tags on e.g. Category talk:4 ft 6 in gauge railways in the United Kingdom, but my proposal is to make the behaviour the same, and in both cases to reduce it to the class category only (and make the classes themselves categorize as "NA importance"). This would only require an update in the templates/modules behind this, not on the pages directly, I think. Fram (talk) 15:15, 9 December 2024 (UTC)
- Are there any pages that don't have the default? e.g. are there any portals or Category talk: pages rated something other than N/A importance? If not then I can't see any downsides to the proposal as written. If there are exceptions, then as long as the revised behaviour allows for the default to be overwritten when desired again it would seem beneficial. Thryduulf (talk) 16:36, 9 December 2024 (UTC)
- As far as I know, there are no exceptions. And I believe that one can always override the default behaviour with a local parameter. @Tom.Reding: I guess you know these things better and/or knows who to contact for this. Fram (talk) 16:41, 9 December 2024 (UTC)
- Looking a bit further, there do seem to be exceptions, but I wonder why we would e.g. have redirects which are of high importance to a project (Category:Redirect-Class United States articles of High-importance). Certainly when one considers that in some cases, the targets have a lower importance than the redirects? E.g. Talk:List of Mississippi county name etymologies. Fram (talk) 16:46, 9 December 2024 (UTC)
- I was imagining high importance United States redirects to be things like USA but that isn't there and what is is a very motley collection. I only took a look at one, Talk:United States women. As far as I can make out the article was originally at this title but later moved to Women in the United States over a redirect. Both titles had independent talk pages that were neither swapped nor combined, each being rated high importance when they were the talk page of the article. It seems like a worthwhile exercise for the project to determine whether any of those redirects are actually (still?) high priority but that's independent of this proposal. Thryduulf (talk) 17:17, 9 December 2024 (UTC)
- Category:Custom importance masks of WikiProject banners (15) is where to look for projects that might use an importance other than NA for cats, or other deviations. ~ Tom.Reding (talk ⋅dgaf) 17:54, 9 December 2024 (UTC)
- Most projects don't use this double intersection (as can be seen by the number of categories in Category:Articles by quality and importance, compared to Category:GA-Class articles). I personally feel that a bot-updated page like User:WP 1.0 bot/Tables/Project/Television is enough here and requires less category maintenance (creating, moving, updating, etc.) for a system that is underused. Gonnym (talk) 17:41, 9 December 2024 (UTC)
- Support this, even if there might be a few exceptions, it will make them easier to spot and deal with rather than having large unsorted NA-importance categories. Chaotic Enby (talk · contribs) 18:04, 9 December 2024 (UTC)
- Strongly agree with this. It's bizarre having two different systems, as well as a pain in the ass sometimes. Ideally we should adopt a single consistent categorization system for importance/quality. – Closed Limelike Curves (talk) 22:56, 16 December 2024 (UTC)
Okay, does anyone know what should be changed to implement this? I presume this comes from Module:WikiProject banner, I'll inform the people there about this discussion. Fram (talk) 14:49, 13 December 2024 (UTC)
- So essentially what you are proposing is to delete Category:NA-importance articles and all its subcategories? I think it would be best to open a CfD for this, so that the full implications can be discussed and consensus assured. It is likely to have an effect on assessment tools, and tables such as User:WP 1.0 bot/Tables/Project/Africa would no longer add up to the expected number — Martin (MSGJ · talk) 22:13, 14 December 2024 (UTC)
- There was a CfD specifically for one, and the deletion of Category:Category-Class Comics articles of NA-importance doesn't seem to have broken anything so far. A CfD for the deletion of 1700+ pages seems impractical, an RfC would be better probably. Fram (talk) 08:52, 16 December 2024 (UTC)
- Well a CfD just got closed with 14,000 categories, so that is not a barrier. It is also the technically correct venue for such discussions. By the way, all of the quality/importance intersection categories check that the category exists before using it, so deleting them shouldn't break anything. — Martin (MSGJ · talk) 08:57, 16 December 2024 (UTC)
- And were all these cats tagged, or how was this handled? Fram (talk) 10:21, 16 December 2024 (UTC)
- Wikipedia:Categories for discussion/Log/2024 December 7#Category:Category-Class articles. HouseBlaster took care of listing each separate category on the working page. — Martin (MSGJ · talk) 10:43, 16 December 2024 (UTC)
- I have no idea what the "working page" is though. Fram (talk) 11:02, 16 December 2024 (UTC)
I'm going to have to oppose any more changes to class categories. Already changes are causing chaos across the system with the bots unable to process renamings and fixing redirects whilst Special:Wantedcategories is being overwhelmed by the side effects. Quite simply we must have no more changes that cannot be properly processed. Any proposal must have clear instructions posted before it is initiated, not some vague promise to fix a module later on. Timrollpickering (talk) 13:16, 16 December 2024 (UTC)
- Then I'm at an impasse. Module people tell me "start a CfD", you tell me "no CfD, first make changes at the module". No one wants the NA categories for these groups. What we can do is 1. RfC to formalize that they are unwanted, 2. Change module so they no longer get populated 3. Delete the empty cats caused by steps 1 and 2. Is that a workable plan for everybody? Fram (talk) 13:39, 16 December 2024 (UTC)
- I don't think @Timrollpickering was telling you to make the changes at the module first, rather to prepare the changes in advance so that the changes can be implemented as soon as the CfD reaches consensus. For example this might be achieved by having a detailed list of all the changes prepared and published in a format that can be fed to a bot. For a change of this volume though I do think a discussion as well advertised as an RFC is preferable to a CfD though. Thryduulf (talk) 14:43, 16 December 2024 (UTC)
- Got it in one. There are just too many problems at the moment because the modules are not being properly amended in time. We need to be firmer in requiring proponents to identify the how to change before the proposal goes live so others can enact it if necessary, not close the discussion, slap the category on the working page and let a mess pile up whilst no changes to the module are implemented. Timrollpickering (talk) 19:37, 16 December 2024 (UTC)
- Oh, I got it as well, but at the module talk page, I was told to first have a CfD (to determine consensus first I suppose, instead of writing the code without knowing if it will be implemented). As I probably lack the knowledge to make the correct module changes, I'm at an impasse. That's why I suggested an RfC instead of a CfD to determine the consensus for "deletion after the module has been changed", instead of a CfD which is more of the "delete it now" variety. No one here has really objected to the deletion per se, but I guess that a more formal discussion might be welcome. Fram (talk) 10:09, 17 December 2024 (UTC)
- Oppose on the grounds that I think the way we do it currently is fine. PARAKANYAA (talk) 05:33, 18 December 2024 (UTC)
- What's the benefit of having two or three categories for the same group of pages? We have multiple systems (with two or three cats, and apparently other ones as well), with no apparent reason to keep this around. As an example, we have Category:Category-Class film articles with more than 50,000 pages, e.g. Category talk:20th century in American cinema apparently. But when I go to that page, it isn't listed in that category, it is supposedly listed in Category:NA-Class film articles (which seems to be a nonsense category, we shouldn't have NA-class, only NA-importance). But that category doesn't contain that page. So now I have no idea what's going on or what any of this is trying to achieve. Fram (talk) 08:30, 18 December 2024 (UTC)
- Something changed recently. I think. But it is useful to know which NA pages are tagged with a project with a granularity beyond just "Not Article". It helps me do maintenance and find things that are tagged improperly, especially with categories. I do not care what happens to the importance ratings. PARAKANYAA (talk) 09:20, 18 December 2024 (UTC)
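To make the proposal in this section concrete, here is a small sketch of the difference between the current triple categorization and the proposed class-only behaviour. It is an assumption-laden illustration, not the logic of Module:WikiProject banner: the function `banner_categories`, the `NON_ARTICLE_CLASSES` set and the `existing` parameter are all hypothetical names, and the existence check only mimics the behaviour described above for intersection categories.

```python
# Hypothetical set of non-article classes; the real list is defined elsewhere.
NON_ARTICLE_CLASSES = {"Category", "Portal", "Redirect", "File", "Template"}


def banner_categories(project, page_class, importance="NA",
                      proposed=False, existing=None):
    """Return the talk-page categories for one project banner (sketch)."""
    class_cat = f"Category:{page_class}-Class {project} articles"
    importance_cat = f"Category:{importance}-importance {project} articles"
    intersection_cat = (
        f"Category:{page_class}-Class {project} articles "
        f"of {importance}-importance"
    )

    # Proposed behaviour: non-article pages with the default NA importance
    # go into the class category only. A local override such as
    # importance="High" still produces the importance categories.
    if proposed and page_class in NON_ARTICLE_CLASSES and importance == "NA":
        return [class_cat]

    cats = [class_cat, importance_cat]
    # The intersection category is only used if it exists, so deleting it
    # should not leave pages in a red-linked category.
    if existing is None or intersection_cat in existing:
        cats.append(intersection_cat)
    return cats


if __name__ == "__main__":
    print(banner_categories("United States", "Portal"))
    print(banner_categories("United States", "Portal", proposed=True))
```

Under these assumptions, the only pages left in the NA-importance categories after the change would be the anomalies, which is what would make them usable as cleanup categories.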
Category:Current sports events
I would like to propose that sports articles should be left in the Category:Current sports events for 48 hours after these events have finished. I'm sure many Wikipedia sports fans (including me) open CAT:CSE first and then click on a sporting event in that list. And we would like to do so in the coming days after the event ends to see the final standings and results.
Currently, this category is being removed from articles too early, sometimes even before the event ends. Just like yesterday. AnishaShar, what do you say about that?
So I would like to ask you to consider my proposal. Or, if you have a better suggestion, please comment. Thanks, Maiō T. (talk) 16:25, 9 December 2024 (UTC)
- Thank you for bringing up this point. I agree that leaving articles in the Category:Current sports events for a short grace period after the event concludes—such as 48 hours—would benefit readers who want to catch up on the final standings and outcomes. AnishaShar (talk) 18:19, 9 December 2024 (UTC)
- Sounds reasonable on its face. Gatoclass (talk) 23:24, 9 December 2024 (UTC)
- How would this be policed though? Usually that category is populated by the {{current sport event}} template, which every user is going to want to remove immediately after it finishes. Lee Vilenski (talk • contribs) 19:51, 11 December 2024 (UTC)
- @Lee Vilenski: First of all, the Category:Current sports events has nothing to do with the Template:Current sport; articles are added to that category in the usual way.
- You ask how it would be policed. Simply, we will teach editors to do it that way – to leave an article in that category for another 48 hours. AnishaShar have already expressed their opinion above. WL Pro for life is also known for removing 'CAT:CSE's from articles. I think we could put some kind of notice in that category so other editors can notice it. We could set up a vote here. Maybe someone else will have a better idea. Maiō T. (talk) 20:25, 14 December 2024 (UTC)
- Would it not be more suitable for a "recently completed sports event" category. It's pretty inaccurate to say it's current when the event finished over a day ago. Lee Vilenski (talk • contribs) 21:03, 14 December 2024 (UTC)
Okay Lee, that's also a good idea. We have these two sports event categories:
- Category:Scheduled sports events
- Category:Current sports events
- Category:Recent sports events can be a suitable addition to those two. Edin75, you are also interested in categories and sporting events; what is your opinion? Maiō T. (talk) 18:14, 16 December 2024 (UTC)
- I don't have any objection to a Recent sports events category being added, but personally, if I want to see results of recent sports events, I would be more likely to go to Category:December 2024 sports events, which should include all recent events. Edin75 (talk) 23:30, 16 December 2024 (UTC)
- Did this get the go-ahead then? I see a comment has been added to the category, and my most recent edit was reverted when I removed the category after an event finished. I didn't see any further discussion after my last comment. Edin75 (talk) 09:37, 25 December 2024 (UTC)
User-generated conflict maps
In a number of articles we have (or had) user-generated conflict maps. I think the main ones at the moment are Syrian civil war and Russian invasion of Ukraine. The war in Afghanistan had one until it was removed as poorly-sourced in early 2021. As you can see from a brief review of Talk:Syrian civil war the map has become quite controversial there too.
My personal position is that sourcing conflict maps entirely from reports of occupation by one side or another of individual towns at various times, typically from Twitter accounts of dubious reliability, to produce a map of the current situation in an entire country (which is the process described here), is a WP:SYNTH/WP:OR. I also don't see liveuamap.com as necessarily being a highly reliable source either since it basically is an WP:SPS/Wiki-style user-generated source, and when it was discussed at RSN editors there generally agreed with that. I can understand it if a reliable source produces a map that we can use, but that isn't what's happening here.
Part of the reason this flies under the radar on Wikipedia is it ultimately isn't information hosted on EN WP but instead on Commons, where reliable sourcing etc. is not a requirement. However, it is being used on Wikipedia to present information to users and therefore should fall within our PAGs.
I think these maps should be deprecated unless they can be shown to be sourced entirely to a reliable source, and not assembled out of individual reports including unreliable WP:SPS sources. FOARP (talk) 16:57, 11 December 2024 (UTC)
- A lot of the maps seem like they run into SYNTH issues because if they're based on single sources they're likely running into copyright issue as derivative works. I would agree though that if an image does not have clear sourcing it shouldn't be used as running into primary/synth issues. Der Wohltemperierte Fuchs talk 17:09, 11 December 2024 (UTC)
- Though simple information isn't copyrightable, if it's sufficiently visually similar I suppose that might constitute a copyvio. JayCubby 02:32, 13 December 2024 (UTC)
- I agree these violate OR and at least the spirit of NOTNEWS and should be deprecated. I remember during the Wagner rebellion we had to fix one that incorrectly depicted Wagner as controlling a swath of Russia. Levivich (talk) 05:47, 13 December 2024 (UTC)
- The Syrian map (right) seems quite respectable being based on the work of the Institute for the Study of War and having lots of thoughtful process and rules for updates. It is used on many pages and in many Wikipedias. There is therefore a considerable consensus for its use. Andrew🐉(talk) 11:33, 18 December 2024 (UTC)
- Oppose: First off, I'd like to state my bias as a bit of a map geek. I've followed the conflict maps closely for years.
- I think the premise of this question is flawed. Some maps may be poorly sourced, but that doesn't mean all of them are. The updates to the Syrian, Ukraine, and Burma conflicts maps are sourced to third parties. So that resolves the OR issue.
- The sources largely agree with each other, which makes SYNTH irrelevant. Occasionally one source may be ahead of another by a few hours (e.g., LiveUaMap vs. ISW), but they're almost entirely in lock step.
- I think this proposal throws out the baby with the bathwater. One bad map doesn't mean we stop using maps; it means we stop using bad maps.
- You may not like the fact that these sources sometimes use OSI (open-source intelligence). Unfortunately, that is the nature of conflict in a zone where the press isn't allowed. Any information you get from the AP or the US government is likely to rely on the same sources.
- Do they make mistakes? Probably; but so do all historical sources. And these maps have the advantage that the Commons community continuously reviews changes made by other users. Much in the same way that Wikipedia is often more accurate than historical encyclopedias, I believe crowdsourcing may make these maps more accurate than historical ones.
- I think deprecating these maps would leave the reader at a loss (pictures speak a 1,000 words and all that). Does it get a border crossing wrong here or there? Yes, but the knowledge is largely correct.
- It would be an absolute shame to lose access to this knowledge. Magog the Ogre (t • c) 22:59, 19 December 2024 (UTC)
- @Magog the Ogre WP:ITSUSEFUL is frowned upon as an argument for good reason. Beyond that: 1) the fact that these are based on fragmentary data is strangely not mentioned at all (Syrian civil war says 'Military situation as of December 18, 2024 at 2:00pm ET' which suggests that it's quite authoritative and should be trusted; the fact that it's based off the ISW is not disclosed.) 2) I'm not seeing where all the information is coming from the ISW. The ISW's map only covers territory, stuff like bridges, dams, "strategic hills" and the like are not present on the ISW map[1]. Where is that info coming from? Der Wohltemperierte Fuchs talk 23:10, 19 December 2024 (UTC)
- The Commons Syria map uses both the ISW and Liveuamap. The two are largely in agreement, with Liveuamap being more precise but using less reliable sources. If you have an issue with using Liveuamap as a source, fine, bring it up on the talk pages where it's used, or on the Commons talk page itself. But banning any map of a conflict is throwing out the baby with the bathwater. The Ukraine map is largely based on ISW-verifiable information.
- With regards to actual locations like bridges, I'm against banning Commons users from augmenting maps with easily verifiable landmarks. That definition of SYN is broad to the point of meaningless, as it would apply to any user-generated content that uses more than one source. Magog the Ogre (t • c) 23:50, 20 December 2024 (UTC)
- Weak Oppose I've been updating the Ukraine map since May 2022, so I hope my input is helpful. While I agree that some of the sources currently being used to update these maps may be dubious in nature, that has not always been the case. In the past, particularly for the Syria map, these maps have been considered among the most accurate online due to their quality sourcing. It used to be that a source was required for each town if it was to be displayed on these maps, but more recently, people have just accepted taking sources like LiveUAMap and the ISW and copying them exactly. Personally, I think we should keep the maps but change how they are sourced. I think that going back to the old system of requiring a reliable source for each town would clear up most of the issues that you are referring to, though it would probably mean that the maps would be less detailed than they are now. Physeters✉ 07:23, 21 December 2024 (UTC)
- Oppose The campaign maps are one of our absolute best features. The Syrian campaign map in particular was very accurate for much of the war. Having a high quality SVG of an entire country like that is awesome, and there really isn't anything else like it out there, which is why it provides such value to our readers. I think we have to recognize, of course, that they're not 100% accurate, due to the fog of war. I wouldn't mind if we created subpages about the maps? Like, with a list of sources and their dates, designed to be reader facing, so that our readers could verify the control of specific towns for themselves. But getting rid of the maps altogether is throwing out the baby with the bathwater. CaptainEek Edits Ho Cap'n!⚓ 23:33, 22 December 2024 (UTC)
Google Maps: Maps, Places and Routes
Google Maps have the following categories: Maps, Places and Routes
for example: https://www.google.com/maps/place/Sheats+Apartments/@34.0678041,-118.4494914,3a,75y,90t/data=!...........
most significant locations have a www.google.com/maps/place/___ URL
these should be acknowledged and used somehow, perhaps geohack
69.181.17.113 (talk) 00:22, 12 December 2024 (UTC)
- What is the proposal here? If its for the google maps article, that would be more suitable for the talk page. As I see it, your proposal is simply saying that google maps has an api and we should use it for... something. I could be missing something, though Mgjertson (talk) 08:20, 17 December 2024 (UTC)
- As I understand it, the IP is proposing embeds of google maps, which would be nice from a functionality standpoint (the embedded map is kinda-rather buggy), but given Google is an advertising company, isn't great from a privacy standpoint. JayCubby 16:25, 17 December 2024 (UTC)
- I think they're proposing the use of external links rather than embedding. jlwoodwa (talk) 18:16, 17 December 2024 (UTC)
Allowing page movers to enable two-factor authentication
I would like to propose that members of the page mover user group be granted the oathauth-enable permission. This would allow them to use Special:OATH to enable two-factor authentication on their accounts.
Rationale (2FA for page movers)
The page mover guideline already obligates people in that group to have a strong password, and failing to follow proper account security processes is grounds for revocation of the right. This is because the group allows its members to (a) move pages along with up to 100 subpages, (b) override the title blacklist, and (c) have an increased rate limit for moving pages. In the hands of a vandal, these permissions could allow significant damage to be done very quickly, which is likely to be difficult to reverse.
Additionally, there is precedent for granting 2FA access to users with rights that could be extremely dangerous in the event of account compromise, for instance, template editors, importers, and transwiki importers have the ability to enable this access, as do most administrator-level permissions (sysop, checkuser, oversight, bureaucrat, steward, interface admin).
Discussion (2FA for page movers)
- Support as proposer. JJPMaster (she/they) 20:29, 12 December 2024 (UTC)
- Support (but if you really want 2FA you can just request permission to enable it on Meta) * Pppery * it has begun... 20:41, 12 December 2024 (UTC)
- For the record, I do have 2FA enabled. JJPMaster (she/they) 21:47, 12 December 2024 (UTC)
- Oops, that says you are member of "Two-factor authentication testers" (testers = good luck with that). Johnuniq (talk) 23:52, 14 December 2024 (UTC)
- A group name which is IMO seriously misleading - 2FA is not being tested, it's being actively used to protect accounts. * Pppery * it has begun... 23:53, 14 December 2024 (UTC)
- meta:Help:Two-factor authentication still says "currently in production testing with administrators (and users with admin-like permissions like interface editors), bureaucrats, checkusers, oversighters, stewards, edit filter managers and the OATH-testers global group." Hawkeye7 (discuss) 09:42, 15 December 2024 (UTC)
- Support as a pagemover myself, given the potential risks and need for increased security. I haven't requested it yet as I wasn't sure I qualified and didn't want to bother the stewards, but having oathauth-enable by default would make the process a lot more practical. Chaotic Enby (talk · contribs) 22:30, 12 December 2024 (UTC)
- Anyone is qualified - the filter for stewards granting 2FA is just "do you know what you're doing". * Pppery * it has begun... 22:46, 12 December 2024 (UTC)
- Question When's the last time a page mover has had their account compromised and used for pagemove vandalism? Edit 14:35 UTC: I'm not doubting the nom, rather I'm curious and can't think of a better way to phrase things. JayCubby 02:30, 13 December 2024 (UTC)
- Why isn't everybody allowed to enable 2FA? I've never heard of any other website where users have to go request someone's (pro forma, rubber-stamp) permission if they want to use 2FA. And is it accurate that 2FA, after eight years, is still "experimental" and "in production testing"? I guess my overall first impression didn't inspire me with confidence in the reliability and maintenance. Adumbrativus (talk) 06:34, 14 December 2024 (UTC)
- Because the recovery process if you lose access to your device and recovery codes is still "contact WMF Trust and Safety", which doesn't scale. See also phab:T166622#4802579. Anomie⚔ 15:34, 14 December 2024 (UTC)
- We should probably consult with WMF T&S before we create more work for them on what they might view as very low-risk accounts. Courtesy ping @JSutherland (WMF). –Novem Linguae (talk) 16:55, 14 December 2024 (UTC)
- No update comment since 2020 doesn't fill me with hope. I like 2FA, but it needs to be developed into a usable solution for all. Lee Vilenski (talk • contribs) 00:09, 15 December 2024 (UTC)
- I ain't a technical person, but could a less secure version of 2fa be introduced, where an email is sent for any login on new devices? JayCubby 01:13, 15 December 2024 (UTC)
- Definitely. However email addresses also get detached from people, so that would require that people regularly reconfirm their contact information. —TheDJ (talk • contribs) 11:01, 18 December 2024 (UTC)
- For TOTP (the 6-digit codes), it's not quite as bad as when it was written, as the implementation has been fixed over time. I haven't heard nearly as many instances of backup scratch codes not working these days compared to when it was new. The WebAuthn (physical security keys, Windows Hello, Apple Face ID, etc) implementation works fine on private wikis but I wouldn't recommend using it for CentralAuth, especially with the upcoming SUL3 migration. There's some hope it'll work better afterward, but will still require some development effort. As far as I'm aware, WMF is not currently planning to work on the 2FA implementation. As far as risk for page mover accounts goes, they're at a moderate risk. Page move vandalism, while annoying to revert, is reversible and is usually pretty loud (actions of compromised accounts can be detected and stopped easily). The increased ratelimit is the largest concern, but compared to something like account creator (which has noratelimit) it's not too bad. I'm more concerned about new page reviewer. There probably isn't a ton of harm to enabling 2FA for these groups, but there isn't a particularly compelling need either. AntiCompositeNumber (talk) 12:47, 19 December 2024 (UTC)
- Support per nom. PMV is a high-trust role (suppressredirect is the ability to make a blue link turn red), and thus this makes sense. As a side note, I have changed this to bulleted discussion; # is used when we have separate sections for support and oppose. HouseBlaster (talk • he/they) 07:19, 14 December 2024 (UTC)
- Oppose As a pagemover myself, I find pagemover extremely useful and do not wish to lose it. It is nowhere near the same class as template editor. You can already ask the stewards for 2FA although I would recommend creating a separate account for the purpose. After all these years, 2FA remains experimental, buggy and cumbersome. Incompatible with the Microsoft Authenticator app on my iPhone. Hawkeye7 (discuss) 23:59, 14 December 2024 (UTC)
- The proposal (as I read it) isn't "you must have 2FA", rather "you have the option to add it". Lee Vilenski (talk • contribs) 00:06, 15 December 2024 (UTC)
- @Hawkeye7, Lee Vilenski is correct. This would merely provide page movers with the option to enable it. JJPMaster (she/they) 00:28, 15 December 2024 (UTC)
- Understood, but I do not want it associated with an administrator-level permission, which would mean I am not permitted to use it, as I am not an admin. Hawkeye7 (discuss) 09:44, 15 December 2024 (UTC)
- It's not really that. It would be an opt-in to allow users (in the group) to put 2FA on their account - at their own discretion.
- The main reasons why 2FA is currently out to admins and the like is because they are more likely to be targeted for compromising and are also more experienced. The 2FA flag doesn't require any admin skills/tools and is only incidentally linked. Lee Vilenski (talk • contribs) 12:58, 15 December 2024 (UTC)
- Wait, so why is 2FA not an option for everyone already? – Closed Limelike Curves (talk) 01:15, 18 December 2024 (UTC)
- @Closed Limelike Curves MediaWiki's 2FA implementation is complex, and the WMF's processes to support people who get locked out of their account aren't able to handle a large volume of requests (developers can let those who can prove they are the owner of the account back in). My understanding is that the current processes cannot be efficiently scaled up either, as they require 1:1 attention from a developer, so unless and until new processes have been designed, tested and implemented 2FA is intended to be restricted to those who understand how to use it correctly and understand the risks of getting locked out. Thryduulf (talk) 09:36, 18 December 2024 (UTC)
- It probably won't make a huge difference because those who really desire 2FA can already request the permission to enable it for their account, and because no page mover will be required to do so. However, there will be page movers who wouldn't request a global permission for 2FA yet would enable it in their preferences if it was a simple option. And these page movers might benefit from 2FA even more than those who already care very strongly about the security of their account. ~ ToBeFree (talk) 03:18, 15 December 2024 (UTC)
- Support and I can't think of any argument against something not only opt-in but already able to be opted into. Gnomingstuff (talk) 08:09, 15 December 2024 (UTC)
- Oppose this is a low value permission, not needed. If an individual PMV really wants to opt-in, they can already do so over at meta - no need to build custom configuration for this locally. — xaosflux Talk 15:06, 18 December 2024 (UTC)
- Support; IMO all users should have the option to add 2FA. Stifle (talk) 10:26, 19 December 2024 (UTC)
- Support All users should be able to opt in to 2FA. Lack of a scalable workflow for users locked out of their accounts is going to be addressed by WMF only if enough people are using 2FA (and getting locked out?) to warrant its inclusion in the product roadmap. – SD0001 (talk) 14:01, 19 December 2024 (UTC)
- That (and to @Stifle above) sounds like an argument to do just that - get support put in place and enable this globally, not to piecemeal it in tiny batches for discretionary groups on a single project (this custom configuration would support about 3/10ths of one percent of our active editors). To the point of this RFC, why do you think adding this for this specific tiny group is a good idea? — xaosflux Talk 15:40, 19 December 2024 (UTC)
- FWIW, I tried to turn this on for anyone on meta-wiki, and the RFC failed (meta:Meta:Requests for comment/Enable 2FA on meta for all users). — xaosflux Talk 21:21, 19 December 2024 (UTC)
- Exactly. Rolling it out in small batches helps build the case for a bigger rollout in the future. – SD0001 (talk) 05:24, 20 December 2024 (UTC)
- I'm pretty sure that 2FA is already available to anyone. You just have to want it enough to either request it "for testing purposes" or to go to testwiki and request that you be made an admin there, which will automatically give you access. See H:ACCESS2FA. WhatamIdoing (talk) 23:41, 21 December 2024 (UTC)
- We shouldn't have to jump through borderline manipulative and social-engineering hoops to get basic security functionality. — SMcCandlish ☏ ¢ 😼 04:40, 22 December 2024 (UTC)
- Oppose. It sounds like account recovery when 2FA is enabled involves Trust and Safety. I don't think page movers' account security is important enough to justify increasing the burden on them. —Compassionate727 (T·C) 14:10, 21 December 2024 (UTC)
- Losing access to the account is less common nowadays since most 2FA apps, including Google Authenticator, have implemented cloud syncing so that even if you lose your phone, you can still access the codes from another device. – SD0001 (talk) 14:40, 21 December 2024 (UTC)
- But this isn't about Google Authenticator. Johnuniq (talk) 02:58, 22 December 2024 (UTC)
- Google Authenticator is a 2FA app, which at least till some point used to be the most popular one. – SD0001 (talk) 07:07, 22 December 2024 (UTC)
- But (I believe), it is not available for use at Wikipedia. Johnuniq (talk) 07:27, 22 December 2024 (UTC)
- That's not true. You can use any TOTP authenticator app for MediaWiki 2FA. I currently use Ente Auth, having moved on from Authy recently, and from Google Authenticator a few years back. In case you're thinking of SMS-based 2FA, it has become a thing of the past and is not supported by MediaWiki either because it's insecure (attackers have ways to trick your network provider to send them your texts). – SD0001 (talk) 09:19, 22 December 2024 (UTC)
- Support. Even aside from the fact that, in 2024+, everyone should be able to turn on 2FA .... Well, absolutely certainly should everyone who has an advanced bit, with potential for havoc in the wrong hands, be able to use 2FA here. That also includes template-editor, edit-filter-manager, file-mover, account-creator (and supersets like event-coordinator), checkuser (which is not strictly tied to adminship), and probably also mass-message-sender, perhaps a couple of the others, too. Some of us old hands have several of these bits and are almost as much risk as an admin when it comes to loss of account control. — SMcCandlish ☏ ¢ 😼 04:40, 22 December 2024 (UTC)
- Take a look at Special:ListGroupRights - much of what you mentioned is already in place, because these are groups that could use it and are widespread groups used on most WMF projects. (Unlike extendedmover). — xaosflux Talk 17:22, 22 December 2024 (UTC)
- Re "That also includes [...], file-mover, account-creator (and supersets like event-coordinator), [...] and probably mass-message-sender". How in any way would the file mover, account creator, event coordinator and mass message sender user groups be considered privileged, and therefore have the oathauth-enable userright? ToadetteEdit (talk) 17:37, 24 December 2024 (UTC)
- Comment: It is really not usual for 2FA to be available to a user group that is not defined as privileged in the WMF files. By default, all user groups defined at CommonSettings.php (iirc) that are considered to be privileged have the oathauth-enable right. Also, the account security practices mentioned in wp:PGM are also mentioned at wp:New pages patrol/Reviewers, despite not being discussed at all. Shouldn't it be fair to have the extendedmover userright be defined as privileged? ToadetteEdit (talk) 08:33, 23 December 2024 (UTC)
- Support. Like SMcCandlish, I'd prefer that anyone, and particularly any editor with advanced perms, be allowed to turn on 2FA if they want (this is already an option on some social media platforms). But this is a good start, too. Since this is a proposal to allow page movers to opt in to 2FA, rather than a proposal to mandate 2FA for page movers, I see no downside in doing this. – Epicgenius (talk) 17:02, 23 December 2024 (UTC)
- Support this opt-in for PMs and the broader idea of everyone having it by default. Forgive me if this sounds blunt, but the responsibility and accountability for protecting your account lie with you and not the WMF. Yes, they can assist in recovery, but the burden should not lie on them. ~/Bunnypranav:<ping> 17:13, 23 December 2024 (UTC)
Photographs by Peter Klashorst
Back in 2023 I unsuccessfully nominated a group of nude photographs by Peter Klashorst for deletion on Commons. I was concerned that the people depicted might not have been of age or consented to publication. Klashorst described himself as a "painting sex-tourist"[2] because he would travel to third-world countries to have sex with women in brothels, and also paint pictures of them[3][4]. On his Flickr account, he posted various nude photographs of African and Asian women, some of which appear to have been taken without the subjects' knowledge. Over the years, other Commons contributors have raised concerns about the Klashorst photographs (e.g. [5][6][7]).
I noticed recently that several of the Klashorst images had disappeared from Commons but the deletions hadn't been logged. I believe this happens when the WMF takes an office action to remove files. I don't know for sure whether that's the case, or why only a small number of the photographs were removed this way.
My proposal is that we stop using nude or explicit photographs by Klashorst in all namespaces of the English Wikipedia. This would affect about thirty pages, including high-traffic anatomy articles such as Buttocks and Vulva. gnu57 18:29, 16 December 2024 (UTC)
- @Genericusername57: This seems as if it's essentially a request for a community sanction, and thus probably belongs better on the administrators' noticeboard. Please tell me if I am mistaken. JJPMaster (she/they) 23:12, 16 December 2024 (UTC)
- @JJPMaster: I am fine with moving the discussion elsewhere, if you think it more suitable. gnu57 02:16, 17 December 2024 (UTC)
- @Genericusername57: I disagree with JJPMaster in that this seems to be the right venue, but I also disagree with your proposal. Klashorst might have been a sleazeball, yes, but the images at the two listed articles do not show recognizable subjects, nor do they resemble “creepshots”, nor is there evidence they’re underage. If you object to his images you can nominate them on Commons. Your ‘23 mass nomination failed because it was extremely indiscriminate (i.e. it included a self portrait of the artist). Dronebogus (talk) 00:30, 17 December 2024 (UTC)
- @Dronebogus: According to User:Lar, Commons users repeatedly contacted Klashorst, asking him to provide proof of age and consent for his models, but he did not do so. I am planning on renominating the photographs on Commons, and I think removing them from enwiki first will help avoid spurious c:COM:INUSE arguments. The self-portrait you are referring to also included another naked person. gnu57 02:16, 17 December 2024 (UTC)
- @Genericusername57: replacing the ones at vulva and buttocks wouldn’t be difficult; the first article arguably violates WP:ETHNICGALLERY and conflicts with human penis only showing a single image anyway. However, I think it’s best if you went to those actual articles and discussed removing them. I don’t know what other pages use his images besides his own article, but they should be dealt with separately. If you want to discuss banning his photos from Wikimedia in general, that’s best discussed at Commons. In all cases my personal view is that, regardless of whether they actually run afoul of any laws, purging creepy, exploitative pornography of third-world women is no great loss. Dronebogus (talk) 01:16, 18 December 2024 (UTC)
- I have to confess that I do not remember the details of the attempts to clarify things with Peter. If this turns out to be something upon which this decision might turn, I will try to do more research. But I’m afraid it’s lost in the mists of time. ++Lar: t/c 01:25, 24 December 2024 (UTC)
- Note also that further attempts to clarify matters directly with Peter will not be possible, as he is now deceased. ++Lar: t/c 15:45, 24 December 2024 (UTC)
- Several issues here. First, if the files are illegal, that's a matter for Commons as they should be deleted. On the enwiki side of things, if there's doubt about legality, Commons has plenty of other photos that can be used instead. Just replace the photos. The second issue is exploitation. Commons does have commons:COM:DIGNITY which could apply, and depending on the country in which the photo was taken there may be stricter laws for publication vs. capture, but it's a hard sell to delete things on Commons if it seems like the person in the photo consented (with or without payment). The problem with removing files that may be tainted by exploitation is we'd presumably have to remove basically all images of all people who were imprisoned, enslaved, colonized, or vulnerable at the time of the photo/painting/drawing. It becomes a balance where we consider the context of the image (the specifics of when/where/how it was taken), whether the subject is still alive (probably relevant here), and encyclopedic importance. I'd be inclined to agree with some above that there aren't many photos here that couldn't be replaced with something else from Commons, but I don't think you'll find support for a formalized ban. Here's a question: what happens when you just try to replace them? As long as the photo you're replacing it with is high quality and just as relevant to the article, I don't think you'd face many challenges. — Rhododendrites talk \\ 16:20, 24 December 2024 (UTC)
Move the last edited notice from the bottom of the page to somewhere that's easier to find
Currently, if you want to check when the last page edit was, you have to look at the edit history or scroll all the way to the bottom of the page and look for it near the licensing info. I propose moving it under the view history and watch buttons, across from the standard "This article is from Wikipedia" disclaimer. Non-technical users may be put off by the behind-the-scenes nature of the page or simply not know of its existence. The Mobile site handles this quite gracefully in my opinion. While it is still at the bottom of the page, it isn't found near Licensing talk and is a noticeable portion of the page. Mgjertson (talk) 08:32, 17 December 2024 (UTC)
- Editors can already enable mw:XTools § PageInfo gadget, which provides this information (and more) below the article title. I don't think non-editors would find it useful enough to be worth the space. jlwoodwa (talk) 18:12, 17 December 2024 (UTC)
I wished Wikipedia supported wallpapers in pages...
It would be even more awesome if we could change the wallpaper of pages in Wikipedia. But the fonts' colors could change to adapt to the wallpaper. The button for that might look like this: Change wallpaper Gnu779 (talk) 11:02, 21 December 2024 (UTC)
- I think we already tried this. It was called Myspace ;) —TheDJ (talk • contribs) 11:51, 21 December 2024 (UTC)
- See Help:User style for information on creating your own stylesheet. isaacl (talk) 18:03, 21 December 2024 (UTC)
- @Gnu779: You have successfully nerd-sniped me, so I’m gonna work on a user script for this. JJPMaster (she/they) 22:54, 26 December 2024 (UTC)
Change page titles/names using "LGBTQ" to "LGBTQ+"
Please see my reasoning at Wikipedia talk:WikiProject LGBTQ+ studies#LGBTQ to LGBTQ+ (and please post your thoughts there). It was proposed that I use this page to escalate this matter, as seen on the linked talk page. Helper201 (talk) 20:42, 23 December 2024 (UTC)
- Snowclose - As mentioned in that discussion, there was a decision on this topic not long ago based on ngram data which led to the LGBT -> LGBTQ rename. It hasn't been long enough for consensus to substantially change, and the ngram dataset hasn't been updated since that previous proposal. BugGhost 🦗👻 10:00, 26 December 2024 (UTC)
- Agree with BugGhost; I also personally think this topic area (LGBTetc. acronyms) can lean uncomfortably close to WP:GREATWRONGS and WP:TOOSOON. People who by contemporary westernized standards would not be considered “hetero-typical” or “cis-typical” have always existed; the current terminology around them is extremely young. Dronebogus (talk) 14:05, 26 December 2024 (UTC)