
What The Web Knows About You

Having written a few times about Internet security and privacy, I was recently asked how one finds out what the web knows about any given person. That was one of those “good question” enquiries.

To state the obvious, the Internet giants are almost unimaginably big. Google is not only the world’s largest search engine; it is also one of the top three email providers, a social network, and the owner of both the Blogger platform and the world’s largest video site, YouTube. Facebook has the social contacts, messages, wall posts and photos of more than 750 million people.

Given that such information could be used covertly or overtly to sell us stuff, accessed by government or law enforcement bodies and, lately, by prospective employers or, theoretically at least, picked up by hackers or others, it’s not unreasonable to wonder exactly how much the Internet giants know about us.

The simplest step is to search for your name, although the results may be very different depending on which country is providing the primary search engine and database. That is because privacy laws differ from country to country, and Google, for instance, assures us that it obeys them and controls database access accordingly. As a result, we may well be defeated in our searches.
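As a rough illustration of that first step, the sketch below builds a few exact-phrase search addresses for a name. The function name, the choice of name variants and the use of the `https://www.google.com/search?q=` address are my own assumptions for the example; other search engines and country-specific domains can be swapped in through the `base` parameter, and they may well return different results.

```python
from urllib.parse import quote_plus


def name_search_urls(first, last, base="https://www.google.com/search?q="):
    """Build exact-phrase search URLs for a few common forms of a name.

    `base` is an assumed search address; replace it with the search
    engine or country domain you actually want to check.
    """
    variants = [
        f'"{first} {last}"',      # e.g. "Jane Citizen"
        f'"{last}, {first}"',     # e.g. "Citizen, Jane"
        f'"{first[0]}. {last}"',  # e.g. "J. Citizen"
    ]
    # The quotation marks ask for an exact phrase; quote_plus makes the
    # text safe to place in a web address.
    return [base + quote_plus(v) for v in variants]


for url in name_search_urls("Jane", "Citizen"):
    print(url)
```

Pasting each printed address into a browser runs the corresponding search. “Jane Citizen” is, of course, a made-up name.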

Thankfully, Google isn’t totally unhelpful. It has two tools that help show the information it holds on you. The first, Google Dashboard, has run for about three years and gathers information from almost all of Google’s services in one place. Another feature, the “account activity report”, was launched recently and shows Google’s information on your logins in the past month, including countries, browsers, platforms and how much these services have been used.

Running these tools on my work email account is disconcerting, but not too much so. The Dashboard can see I’m a member of a few internal Google Groups and have a Blogger account used to collaborate with some researchers on web optimisation data.

The Gmail section lists the number of contacts you have, which perhaps one should review, as well as the number of Google Docs that have been opened. The site also lists my most recent sent and received emails.

A more disconcerting feature may be the chat history, which logs all your conversations. Google Chat is a handy way to collaborate in an organisation, where the norm these days seems to be to talk electronically rather than in the flesh. Just remember that everything is recorded, including your little bits of gossip. It may be wise to delete some of these logs.

There are also entries for Google+ and YouTube, the nicknames you use for different accounts, YouTube usage figures and even your Maps search history. Google also holds information on your login IPs and other anonymous non-logged-in data but doesn’t (yet) make this available.

Google (which is, after all, the market leader in online advertising) insists that the tracking for its display adverts doesn’t draw on user data but comes instead from cookies, files that anonymously monitor the sites you visit. Fortunately, this information is only a selection of Google’s best guesses based on your history files, and it can produce some amusing results. It is an attempt to “type” you and thereby produce more relevant search results.

By using Ghostery, which I wrote about a few months ago, I am now defeating some of Google’s tracking attempts.

As I am a smartphone user, Google does, however, store all the information from both my phone and tablet as part of my backup regime for these devices. I cannot complain about that. I could use any of a number of other free cloud services instead, but I prefer not to scatter my data all over the web, irrespective of their security features.

Another hoarder of information is, of course, Facebook. Facebook lets you download a history of the data you’ve put on the site. This is fairly limited in that it omits many classes of information: wall posts on other people’s profiles, others’ photos of you, and more. That said, there is a lot contained in the two releases you can get.

You get the archive through your Facebook account settings: find and click “Download a copy of your Facebook data”. The archive includes lots of information that will be unsurprising (your photos, wall and notes) and maybe a few bits you wouldn’t expect: I was surprised to see every Facebook event I’d ever been invited to neatly listed within. But I covered this topic recently, so I won’t reprise it here.

There is at least one more search engine worth listing: pipl.com. That is not a typo but rather a strange abbreviation of the word “people”. It is worth a look because, to my surprise, it appears to have scanned some ancient documents which belong to me and which were retrieved from an unexpected archive that others would almost certainly be unaware of.

Of course, there are other resources listing more of my preferences and doings, which make me ask the question: is the age of HAL, the super-intelligent computer from 2001: A Space Odyssey, approaching?

The databases held by iTunes, Amazon and the like are the underlying value of these organisations. And the more we shop online, the more we spread our details around.

An important first step in controlling what people can find about you on the web is knowing what’s published about you online. Your online identity is determined not only by what you post, but also by what others post about you, be it a mention in a blog post, a photo tag or a reply to a public status update. When someone searches for your name on a search engine like Google, the results that appear are probably a combination of information you’ve posted and information published by others.

Google offers a tool, “Me on the Web”, which makes it easier to monitor your identity online. It helps you set up Google Alerts, so that you receive notifications when you are mentioned on websites or in news stories, and it automatically suggests some search terms you may want to keep an eye on.

“Me on the Web” also provides links to resources offering information on how to control what third-party information is posted about you on the web. These include tips like reaching out to the webmaster of a site to ask for the content to be taken down, or publishing additional information on your own to help make less relevant websites appear farther down in search results.

Curiously, one can get different results depending on which browser one uses. I have no valid explanation for this. Search engine caches also store copies of past pages, and it may take some time for these to be removed.

If all this is important to you, there are paid-for services that will track your cyberspace data for you. If your information was once on a web page, there is a fair chance the Wayback Machine also has a record of it.
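For the curious, here is a minimal sketch of how one might check the Wayback Machine for a page. It assumes the Internet Archive’s public availability address, `https://archive.org/wayback/available`, and the reply layout as I understand it (an `archived_snapshots` entry holding a `closest` snapshot); the function names are my own, and the layout of the reply may change.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed address of the Internet Archive's availability service.
API = "https://archive.org/wayback/available"


def availability_url(page):
    """Build the query address for a page such as 'example.com'."""
    return API + "?" + urlencode({"url": page})


def closest_snapshot(response_text):
    """Return (timestamp, snapshot_url) from the JSON reply, or None.

    Assumes the reply looks like:
    {"archived_snapshots": {"closest": {"available": true,
                                        "timestamp": "...", "url": "..."}}}
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["timestamp"], closest["url"]


def lookup(page, timeout=10):
    """Ask the service about one page (needs network access)."""
    with urlopen(availability_url(page), timeout=timeout) as reply:
        return closest_snapshot(reply.read().decode("utf-8"))


# Example use, left as a comment because it goes out to the network:
# print(lookup("example.com"))
```

A result of `None` only means this particular lookup found no snapshot; it is not proof that a page was never archived.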

The simplest solution is to be conscious of what information you provide at any time, although admittedly that is becoming a serious hurdle. The simple answer is that the web knows more about you than you think.

It’s weird, isn’t it? All these assertions, made by a mess of magnetic dust and a collection of binary 1s and 0s.
