Help Me Understand Comment Spam

So, I get a lot of comment spam here, probably a couple of orders of magnitude more than I get real comments (sigh). The vast majority of this gets blocked by built-in filters, so none of the stuff pitching medically implausible treatments for whatever makes it to a point where I have to see it. There's one new category of junk comment, though, that slips through the filters and requires human moderation (i.e., I have to approve or reject it), and I find this utterly baffling.

The comments are sorta-kinda relevant to the post that they're sent to, albeit with dubious English, and they don't contain any obvious pitch for anything. They also don't include a URL. What they do have, somewhere, is an eight-or-nine-digit number that looks fairly random. So, for example, a spam comment sent to the recent post about searching for exotic physics might read something like:

I agree that the Standard Model of particle physics is very successful theory, but what other theories might be there to explain unexplained phenomena? I think for example alien abduction and crop circles are very interesting things, but do not know can the Standard Model explain these? i18675309

Sometimes the gibberish string of digits is appended to the name field rather than the text of the post. Geographically, most of these originate in South Africa (to the point where I'm basically just flagging anything from Pretoria as spam), but not always. The email addresses are usually legit-looking gmail accounts; that is, not obviously computer-generated alphanumeric strings, but things that might plausibly relate to an actual human name, but not necessarily the name given in the "Name" field.
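For anyone who wants to flag these automatically, the pattern can be sketched as a regular expression. This is only a rough sketch, assuming the identifier is always eight or nine digits, optionally preceded by a "#" and/or a single letter, and that it can turn up in either the name field or the comment text. The names `ID_PATTERN` and `find_id_tokens` are made up for illustration and are not part of any blog software.

```python
import re

# Assumption: 8 or 9 digits, optionally prefixed by '#' and/or one letter.
# The lookarounds keep the pattern from matching inside longer words or
# longer digit runs (a 10-digit phone number, for example).
ID_PATTERN = re.compile(r"(?<![\w#])#?[A-Za-z]?\d{8,9}(?!\w)")


def find_id_tokens(name: str, text: str) -> list:
    """Return candidate identifier tokens from the name field or comment body."""
    # Search both fields, since the number sometimes lands in the name.
    return ID_PATTERN.findall(f"{name}\n{text}")
```

A comment for which this returns a non-empty list is only a candidate for a closer look; the pattern says nothing about what the number means.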

I find these very irritating, not just because they make me question everything that comes in with less than perfect English, but because I can't figure out what the angle is. They're not selling anything in the text, and they don't leave a URL so they're not getting any Google boost from having the comments go through. The entire purpose seems to be to get those strings of numbers into blog comments, and I can't figure out what the hell those numbers are.

So, does anybody know what the deal is with this? What are those numbers? What is the purpose of trying to get them into my comment section?


Chad-

Those numbers are irrelevant to the posts, innocuous at best, and a potential indication of some kind of cybercrime at worst. At minimum you should probably remove the numbers from the comments. I see similar numbers on comments in other blogs on Scienceblogs, so I'd also suggest that you contact others who have had this problem and let them know to do likewise.

Innocuous: Possibly some kind of tracking number or identifier sent by the sender's broadband provider, device, or browser, appended to comments in a manner analogous to the pesky advertising sigs that are sent by certain devices, e.g. "sent from my iPad." Possibly an artifact of the Scienceblogs infrastructure, as an accidental response to something, for example the sender's IP address.

Cyber-crime: Possibly a side-effect of a device that's infected with malware, or something that's deliberately spread by malware for some purpose known only to the author or user of the malware. These could also be identifiers for test comments by individuals who later return to post other types of spam under the same or different name, or for persons getting paid to post spammy comments to test the boundaries of site moderation or automated spam rejection. If any of this is the case, then deleting the numbers denies the cybercriminal whatever benefit they might have provided.

Other: Suggestions for lottery numbers, by people who don't understand that lotteries are a math-ignorance tax?;-)

In any case, stripping the numbers off the posts won't do any harm unless the numbers are related to cybercrime, in which case that will make life more difficult for the cybercriminals, which is always good.

Those numbers at the end identify the person who is commenting on the blog. It is an assignment.

Googling the number leads (uniquely, if you ignore direct and indirect hits on this page) to a TripAdvisor report....

If that's the intention, it's /very/ indirect!

By Steve Glover on 15 Apr 2015

I've seen such comments on some of the other ScienceBlogs. The explanation given in #2 was the explanation the commenters offered when Martin Rundkvist asked about it.

As for the language: South Africa was part of the British Empire, so kids learn the Queen's English in school. It's a second language for many students (other languages spoken in South Africa include Afrikaans and Xhosa), but they should be learning it at a young enough age to become fluent. You might get an occasional nonstandard sentence construction, as in the example you gave, but it should make sense.

By Eric Lund on 15 Apr 2015

Unfortunately, Eric, "the Queen's English" is supposed to be taught, but in practice it is not nearly up to standard. The South African education system is dismal. It was rated last out of 148 countries and is said to have the worst maths and science. I have had first-hand experience helping at a poorer government school, and there are at least 30 in tiny classrooms. The majority of them can't speak English, so the teachers usually teach in other languages. What if children struggle or fail? They'll push them through anyway, hoping that they succeed in life, because there are too many coming in the next year. And so the vicious cycle runs. It is very sad. So when we all get to university (that is who the 'numbers' are) we have to take a language subject that makes a feeble attempt to improve English by giving us blogging assignments etc.

#2 is indeed correct. We have been given an assaigmnent to learn how to blog, our aim is to blog on different blogs with relevant information and to show insight

Chad, it seems like this came up once before, where you had identified that students were using your blog comment section as part of an assignment. Or am I misremembering that?

I don't know if your example is really representative, but if it is, you sort of have to make a decision about whether you want to provide a forum for people talking about alien abductions and crop circles. I would suggest that you shouldn't.

It might be useful to try to get in touch with the teacher giving this as an assignment and see if there's a more productive way for their students to contribute.

Perhaps the teachers should instruct the students to end their posts with a sentence, something like "This comment is made for a grade 11 Physics assignment at Pretoria high school; my student identifier is 2746263."

By Rosie Redfield on 15 Apr 2015

They're aliens trying to tell you something ;)

I get these too. My guess would be that they're trying to get more hits for that number just by spreading it all over the place? I haven't looked at Google search algorithm updates for a while, but maybe it increases the ranking of the TripAdvisor page mentioned above.

The other thing that I have been wondering is whether it gives credibility to the account used if you don't mark it as spam. But then, why the number?

PS: Regarding the credibility hypothesis, I originally thought that the numbers were a bot mistake left over from trying to cope with the captcha on my blog. But you don't have one here, so that doesn't make sense.

Assignment sounds too good not to be true, but some spam filters will detect multiple identical messages and reject them; the workaround is to add some unique text to each message sent.

Chad-
The "spam" from Pretoria is an assignment given to first-year students in the faculty of Agricultural and natural sciences at the university (cited in my location). The scope of the assignment is for the students to read the post and comment. These are first-year students. Most of the comments are usually governed only by the little comprehension they have of the subject matter, derived only from the posts that they read on sites such as scienceblogs.com. Scienceblogs was cited in class as a reliable and credible discussion forum on scientific topics, hence the flocking of students to this forum (where they found your post).

The students are meant to be "blogging" and starting a discussion stemming from their posts (in this case, their comments). I guess they (myself included :)) have succeeded in creating discussion if this post is anything to go by.

Please note that in South Africa we use "British" English in academia. The spellings used for some words and the use of some words are very different from "American" English. This, however, is no excuse for bad grammar. [Example: Tyre in South Africa = the apparatus that a bicycle runs on, as opposed to Tire in America.]
I hope these comments do not erode the scientific credibility of your work.

By Maleho Sadiki on 15 Apr 2015

With reference to #12

The numbers at the end of the comments are the respective students' student numbers (student identification numbers). These are used by the lecturer to identify and attribute the comment to the student.
#15211224

By Maleho Sadiki on 15 Apr 2015

"We have been given an assaigmnent to learn how to blog, our aim is to blog on different blogs with relevant information and to show insight"

Er, so that's a big fat FAIL then, seeing as commenting on blog posts isn't blogging.

By Craig Thomas on 15 Apr 2015

@Chad -- this is across multiple blogs here; I've seen lots of them on Ethan's "Starts with a Bang." The answer from #2 and others seems to be correct. All of the postings claim a Pretoria or South Africa address, and every once in a while you'll see a double post from someone, where the second one is just the statement "Student number XXXXXXXXX." In other words, they forgot to include it in the original posting.

The problem I see with it is that they are just drive-by comments. Quite often, it is obvious that the students never actually read the blog, but just comment generically, or based on the title. If they do ask a question, and some of us answer, there's never any followup (which probably means they don't know there even was an answer).

By Michael Kelsey on 15 Apr 2015

The amount of spam you describe has been shown to strongly correlate with the blog's number of hits. It's apparently extensive.

By Bruce Fowler on 16 Apr 2015

I get a lot of these now. I'm glad to see it is an assignment and not aliens. Well, actually aliens wold be cool....

Anyway, as a one time member of the Faculty of the University of Pretoria, I'm glad to be visited by students from there!

It is a bit of a firehose, though.

I have been given an assignment to learn how to blog and part of the assignment is to comment. So this is me commenting.

I'm not sure if my comment falls under spam though.

Anyway, I hope this comment qualifies to get me marks !

u15010202

By Priya Govender on 16 Apr 2015

And Priya provides a rare example of this type of comment that is actually relevant, apt, and informative.
Thanks, Priya.

By Craig Thomas on 16 Apr 2015

This is fascinating. Blogs such as this one have been recommended by our lecturers as credible for an assignment to introduce us to blogging. I could not, however, have imagined the evident storm that has been caused.

Priya's comment is an explanation of your concern. I do believe that you can rest assured that these are not aliens or an act of cybercrime. The numbers are merely a means of identification for mark allocation.

For me the experience has been a first and very enlightening.

15298117

"Well, actually aliens wold be cool…." ,it is would.. Mr.They-make-me-question-everything-that-comes-in-with- less-than-perfect-English.. Love the blog though #13096712 :)

We have been given an assignment to comment on a science blog and we need to include our student numbers at the end: 12288269.

By Brian Mahlangu on 17 Apr 2015

We were given an assignment in which we have to leave comments on blogs.

u15359213

By Precious Moloi on 17 Apr 2015

This is a science blog, and what's up with the numbers, really?

Actually Craig Thomas, I did re-post another comment with the correct grammar straight after I posted the one with terrible grammar but it was conveniently deleted, and the post that makes me look dumb as a doorknob was left up in the comments. We are COMMENTING on blogs. Happy now?

By Amanda Ngcobo on 19 Apr 2015

I get a lot of duplicate or near-duplicate comments, because if the system doesn't recognize some combination of name, email, and IP address, it automatically holds the comment for moderation. Some people interpret that as a glitch, and re-post essentially the same thing almost immediately. I think the record is four times in a row.

I tend to delete the reposts when I see them; this may have led me to delete a comment that contained a substantial correction at some point, for which I apologize. I was not trying to make anybody look dumb, just to keep the comments from being overly repetitive.
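The hold-for-moderation behaviour and the repost problem described above can be sketched roughly as follows. This is a guess at the logic, not the actual Scienceblogs code: it assumes a comment is held unless the exact name, email, and IP address triple has been approved before, and it treats a repost as text that is identical after collapsing case and whitespace. `Commenter`, `needs_moderation`, `normalize`, and `is_repost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commenter:
    name: str
    email: str
    ip: str


def needs_moderation(commenter: Commenter, approved: set) -> bool:
    """Hold the comment unless this exact name/email/IP triple is already approved."""
    # Assumption: all three fields must match; the real rule may be looser.
    return commenter not in approved


def normalize(text: str) -> str:
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def is_repost(text: str, pending: list) -> bool:
    """True if the text matches a comment already waiting in the moderation queue."""
    target = normalize(text)
    return any(normalize(p) == target for p in pending)
```

A check like `is_repost` would only catch near-identical reposts, so a second attempt that contains a real correction would pass through as a distinct comment.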

As much as it is a science blog, the comments, as poor as they may be, may have some sort of importance in furthering findings as some of the comments I have read do lead to some sort of contemplation.

Also, it is easier to learn the Queen's tongue if it's the language you grow up speaking. Even when learnt at a young age, it is not the same if at home you are speaking Zulu or Xhosa with everyone around you. It's unfortunate that we have 11 official languages and they all play some role in our life, not only English.
14258537