Data Mining Briefly Explained | 119 comments
The Fine Print: The following comments are owned by whoever posted them. We are not responsible for them in any way.
in russia (Score:-1, Offtopic)
by Anonymous Coward on Saturday January 04, @03:49PM (#5015496)
Data mines you.

Do you mind?
[ Reply to This ]
Michael Sims (Score:-1, Offtopic)
by Anonymous Coward on Saturday January 04, @03:51PM (#5015506)
is an idiot.
[ Reply to This ]
    Re:Michael Sims (Score:-1, Offtopic)
    by Anonymous Coward on Saturday January 04, @03:56PM (#5015539)
    Oh come on, Timothy - no need to mod this down so quickly - You have to work with the communist fuckwit, you should appreciate this little-known fact being brought to light.
    [ Reply to This | Parent ]
    Re:Michael Sims (Score:-1, Offtopic)
    by Anonymous Coward on Saturday January 04, @10:56PM (#5017489)
    stop slandering people or you will die a slander
    [ Reply to This | Parent ]
Uhhh... (Score:2)
by Grip3n (470031) on Saturday January 04, @03:52PM (#5015515)
(http://www.headstunt.com/)
Note the prominent sticker ;)

Doesn't he mean "snicker"? ;)
[ Reply to This ]
    Mine This California: +1, Unpatriotic (Score:0)
    by Anonymous Coward on Saturday January 04, @04:04PM (#5015573)
    #!/usr/bin/perl -w
    # 531-byte qrpff-fast, Keith Winstein and Marc Horowitz
    # MPEG 2 PS VOB file on stdin -> descrambled output on stdout
    # arguments: title key bytes in least to most-significant order
    $_='while(read+STDIN,$_,2048){$a=29;$b=73;$c=142;$ t=255;@t=map{$_%16or$t^=$c^=(
    $m=(11,10,116,100,11,122,20,100)[$_/16%8])$t^=(72, @z=(64,72,$a^=12*($_%16
    -2?0:$m&17)),$b^=$_%64?12:0,@z)[$_%8]}(16..271);if ((@a=unx"C*",$_)[20]&48){$h
    =5;$_=unxb24,join"",@b=map{xB8,unxb8,chr($_^$a[--$ h+84])}@ARGV;s/...$/1$&/;$
    d=unxV,xb25,$_;$e=256|(ord$b[4])>8^($f=$t&($d>>12^ $d>>4^
    $d^$d/8))>8^($t&($g=($q=$e>>14&7^$e)^$q*8^$q>=8)+= $f+(~$g&$t))for@a[128..$#a]}print+x"C*",@a}';s/x/p ack+/g;eval
    [ Reply to This | Parent ]
    No.. (Score:1, Informative)
    by Anonymous Coward on Saturday January 04, @04:21PM (#5015654)
    There is a redhat sticker in the top-left corner of the picture.
    [ Reply to This | Parent ]
    Tony Soprano in the mines (Score:1)
    by nrobert (605941) on Saturday January 04, @05:50PM (#5016023)
    This guy in the photo looks like Tony Soprano. Maybe the Mob uses RedHat Linux for data mining.

    I missed the episode with T in the server room.

    "Hey, Jackie, whatta these computahs for?"
    [ Reply to This | Parent ]
The Real Key is People.... (Score:4, Insightful)
by airrage (514164) on Saturday January 04, @03:52PM (#5015516)
(http://slashdot.org/~airrage/journal/15458 | Last Journal: Tuesday October 29, @09:53AM)
I think every major corporation has some sort of data mining, and I find that there is a gap between the data (even scrubbed) and the person who needs to make the decisions. Also, the article suggests that CRM is a subset of data mining. In reality, it's the other way around, or completely unrelated, or both, unless I read that sentence wrong.

Chao
[ Reply to This ]
Michael Sims (Score:-1, Offtopic)
by Anonymous Coward on Saturday January 04, @03:52PM (#5015518)
is a total moron.
[ Reply to This ]
.a (Score:0, Troll)
by unterderbrucke (628741) <unterderbrucke@dygo.com> on Saturday January 04, @03:55PM (#5015529)
(http://www.pquinn.com/fries/ | Last Journal: Friday January 03, @04:47PM)
Can't we get it over with and just call "data miners" Big Brother and
[ Reply to This ]
you'd be amazed... (Score:4, Funny)
by inode_buddha (576844) on Saturday January 04, @03:56PM (#5015533)
(Last Journal: Tuesday November 26, @06:11PM)
at how powerful data mining techniques can be. Why, just today I have received 3 more "Nigerian" mails, an offer to increase my bust size (I'm a guy), and an excellent credit report from 5 different, unheard-of companies...

Of course, the local supermarket cannot accept my personal check for groceries without their "discount card", never mind that it was *their* database admins who lost my account after a few weeks...

(er, yeah right, and my driver's licence and birth certificate aren't worth as much as their card ??)

Ggrrrrrrr......
[ Reply to This ]
    Re:you'd be amazed... (Score:1)
    by brodin (200847) on Saturday January 04, @07:27PM (#5016464)
    >an offer to increase my bust size (I'm a guy),
    Yeah, but you're a slashdot reader so you probably have a man-bust. I know I do (!@#$%! New Year's Resolution)...
    [ Reply to This | Parent ]
    Re:you'd be amazed... (Score:0)
    by Anonymous Coward on Sunday January 05, @12:14AM (#5017867)
    Actually, with the card they can do three things.

    1. Develop a history of "bad" checks etc.
    2. Get a whole lot more time to check your history.
    3. Target their most profitable customers.

        That of course doesn't mean they do this. Sadly, I was a consultant who helped a company develop one of these systems.
        For a while the store's most profitable customers (and 20% of the customers generate the bulk of profits) were getting targeted promotions.
          And the stores were targeting the needs of those customers via the data. Managers would call some of the most profitable customers if they left to see what had happened, and we tried to listen much more to the suggestions of those customers.
        However, that all got pulled within a year once another consultant came in. Now it is just a personal harassment and privacy price gouging system. And yes, I am ashamed of birthing this.
    [ Reply to This | Parent ]
    Re:you'd be amazed... (Score:2)
    by geekoid (135745) <notities@yahoo.com> on Sunday January 05, @03:28AM (#5018623)
    (http://slashdot.org/ | Last Journal: Thursday February 21, @04:37PM)
    " an offer to increase my bust size (I'm a guy)"
    Then I take it your wife is getting the emails to increase the size of her penis!

    thank you, I'm here all week!
    [ Reply to This | Parent ]
Scoring 4 points (Score:-1, Offtopic)
by Anonymous Coward on Saturday January 04, @03:56PM (#5015538)
"Dude, you've got to get him out of your basement," said Dwayne.
"I've tried already," I said in a decidedly hushed voice looking back across the road at my parents' house.
"My parents think he's better than me - he's putting some kind of weird technical shit on my dad's computer.
It's fucking hopeless - they like him man."
Sally eyeballed between Dwayne and myself, obviously waiting to see which one of us was going to act first after winning this little debate.
That girl looked really hot in her bleached fucked up pigtails. I knew I had to do something to impress her.
"Just wait till he starts smelling again." Said Dwayne with the smug look of a victor in his eyes as he folded his arms.
"He's gonna start smelling again, and your parents are gonna whiff the goddamn nerd in your basement and think you have some pretty fucking weird friends. If he finds our weed and decides he likes that too - how much shit do you think you'll be in then. Huh?"
Dwayne nodded up and down Gangster style over his crossed arms.
Sally could tell he'd found a pretty tight strangle hold with that one.

Dwayne had won and I found myself acting. With one quick look over my shoulder and the thought of our stash of weed in mind I knew I had to do somethin'. Sally giggled as I started back for the basement again.
Pop's red Buick by the side of the road was a reminder of what was at stake. No goddamn Unix hippy was getting me in deep trouble with the folks like the joyride did last year.
"Get tha fuck outta the basement," was all I could think as I hustled across the lawn.
I looked in the sunken window and sure enough the glow of my dad's computer screen was clear enough in there. Holding me hostage or somethin'.
"Hey, hey, hey, Walter man," I called out with as much nicey shit as I could when I entered.
"Watcha doin?" I said as I neared his hallowed place at my dad's Walmart electronic piece of shite.
Walter looked up, that same look of Jesus Christ in technical Nirvana on his face again.
"Oh, I'm installing Slackware on your father's, er father's computer," he said as he bowed his bearded face again. How a kid of 17 could have that much hair on his face was really a strange thing.
Walter didn't look like he was in total turtle mode yet, maybe it was time to ask the big question.
"Wanna go drive - for a big Mac and coke with me, Dwayne and Sally. Tomato sauce with everything. It's starting to get late y'know?"
I kept my attention on him, whilst sitting back on the urge to throw my dad's computer against the wall.
"The thing about Linux - and well most desktop versions of Unix is that it takes a lot of um, work to set things up just so," said Walter as he began doing some technical shit with the fucking mouse.
I could hear Dwayne and Sally on the steps just outside. It was time for a time-out.

"He looks like he's found a home there" said Dwayne with a chuckle as I poked my head out the basement door.
"Fuck man." I exclaimed.
"He's tighter than Mr. Goober with a set of leathers. How the fuck am I gonna get him outta the house?"
Sally was really beginning to get with the giggles now.
"Dude, maybe you should attach a keyboard or somethin' to a fishing rod - dangle it outside the fucking window."
Sally was really hitting the high notes with her donkey assed laughter by now.
This night was starting to get pretty fucking lame.

EHM
[ Reply to This ]
Prominent sticker (Score:1, Funny)
by Anonymous Coward on Saturday January 04, @03:58PM (#5015545)
Yup, on a Dell from probably 1998-1999. Most of the other Dells in the photo look like they are of the same vintage.

Here's an example of the Microsoft Tax at work. This company most likely paid for Windows licenses on those machines even though they aren't using Windows.
[ Reply to This ]
Data Mining Briefly Explained (Score:4, Funny)
by hdparm (575302) on Saturday January 04, @03:58PM (#5015546)
(http://nzoss.org.nz/)
Briefly? This would be briefly:
  1. Collect data
  2. Do some mining
  3. ???
  4. Profit!
[ Reply to This ]
      The data gnomes are stealing my data! (Score:2)
      by SHEENmaster (581283) <sheenmaster@NosPam.flame.dnsart.com> on Saturday January 04, @04:01PM (#5015560)
      (http://flame.dnsart.com/ | Last Journal: Friday December 27, @04:50AM)
      Why doesn't anyone else see them!?
      [ Reply to This | Parent ]
      *sigh* (Score:3, Funny)
      by Chester K (145560) on Saturday January 04, @04:11PM (#5015610)
      (http://www.evercrest.com/)
      Ok let's get this out of our system now:

       
      Imagine a beowulf cluster of these things!....mining...data... yeah.

      In Soviet Russia, data mines YOU!

      It's official, Data Mining is DEAD. You don't have to be Kreskin to figure it out.

      Hey! I just found this site all about data mining here [goatse.cx]!!!!!

      Come on, really, is this News for Nerds or Stuff That Matters?

      You could probably use data mining to determine how many hot grits Natalie Portman actually eats.
       


      Alright. That should do it. Carry on with the discussion.
      [ Reply to This | Parent ]
      Re:Data Mining Briefly Explained (Score:2)
      by Lucas Membrane (524640) on Saturday January 04, @11:36PM (#5017667)
      You have hit the nail on the head. The ??? is the problem. The link or leap between knowledge and action is the hard part. Data mining can 'identify' 'profitable' and 'unprofitable' customers, but it can't tell you if your expense and profit allocations are right or if you should want to 'get rid' of 'unprofitable' customers or should want to try to turn them into profitable customers.

      The classic data mining result is diapers and beer. People who buy beer at convenience stores are also likely to buy diapers. Great. Given that bit of intelligence, do we:

      1. Put diapers and beer in close proximity so that people who buy diapers can easily pick up beer and vice versa, or
      2. Put diapers and beer at opposite ends of the store so that people who buy both diapers and beer must travel through the store and have a chance to buy everything else?
      The data seldom tell you what to do. Taking the data too seriously leads to treating customers like numbers, predictable statistical entities to be manipulated for profit's sake. This is not healthy for most businesses. Most of the important things that the data tell you, you could learn better by simply listening to customers respectfully.
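
For readers who want to see what a "diapers and beer" finding looks like numerically, here is a minimal sketch of the usual market-basket measures (support, confidence, lift). The baskets are invented for illustration and the helper names are hypothetical; nothing here comes from a real store or from the article. Note that the numbers only quantify the association, which is the point made above: they say nothing about where to put the shelves.

```python
# Hypothetical toy baskets; every value here is made up for illustration.
baskets = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"beer", "chips"},
    {"diapers", "milk"},
    {"milk", "bread"},
    {"beer", "diapers", "milk"},
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Estimated P(consequent in basket | antecedent in basket)."""
    both = set(antecedent) | set(consequent)
    return support(both, baskets) / support(antecedent, baskets)


def lift(antecedent, consequent, baskets):
    """Confidence relative to the consequent's baseline rate.

    A value above 1 suggests the items appear together more often
    than they would if they were bought independently.
    """
    return confidence(antecedent, consequent, baskets) / support(consequent, baskets)


print(f"support(beer, diapers)  = {support({'beer', 'diapers'}, baskets):.3f}")
print(f"confidence(beer -> diapers) = {confidence({'beer'}, {'diapers'}, baskets):.3f}")
print(f"lift(beer -> diapers)   = {lift({'beer'}, {'diapers'}, baskets):.3f}")
```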
      [ Reply to This | Parent ]
      Re:Data Mining Briefly Explained (Score:2)
      by Exantrius (43176) on Sunday January 05, @01:21AM (#5018152)
      (http://slashdot.org/)
      Actually, step three could be explicated as:
      3. Sell derivative information to people who want it, i.e. the people you *DON'T* want to have it.

      This includes, as others said, life insurance companies teaming up with grocery stores to find out what you eat, thus raising rates for people who eat "bad" stuff.

      Or phone spam companies buying info from phone companies-- Consumer A contacts consumer B, and A bought our stuff, therefore you should call B.

      Or, perhaps radio stations selling the numbers of people who request songs to the Wherehouse, so the Wherehouse can call you and say that you can buy the cd.

      Or, maybe the police decide to track where you go by reading license plates off of each of the cameras that they have up to detect speeders or light runners.

      Just some thoughts. This isn't a joke-- They know exactly how to get money from mining-- Whom you can sell it to depends on what data you have. No one buys data for no reason-- And the only two reasons to buy data are to target for selling other stuff, or to "find people who don't want to be found"-- Whether it be to find terrorists, criminals, or theoretically people that make x hundred thousand/million a year, so that they can rob you.

      Of course, most of this stuff happens every day, and no one realizes. /ex.
      [ Reply to This | Parent ]
        Re:Data Mining Briefly Explained (Score:1)
        by Ed Random (27877) on Sunday January 05, @07:30AM (#5019114)
        (http://slashdot.org/ | Last Journal: Saturday July 06, @01:45PM)
        Or, maybe the police decide to track where you go by reading license plates off of each of the cameras that they have up to detect speeders or light runners.

        In fact, we have a license-plate-reading system like this in .nl

        Video cameras record your license plate when you pass a portal, then record it again when you pass the next portal, say after 1 km. The images are stored and processed electronically.

        Your average speed is calculated and you're fined if you were speeding.
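
The arithmetic behind that average-speed check is simple enough to sketch. This is only an illustration of the calculation described above, not the actual system: the function name, the timestamps, and the 100 km/h limit are all assumptions.

```python
from datetime import datetime


def average_speed_kmh(distance_km, t_first, t_second):
    """Average speed between two portal sightings, in km/h."""
    elapsed_hours = (t_second - t_first).total_seconds() / 3600.0
    if elapsed_hours <= 0:
        raise ValueError("second sighting must be later than the first")
    return distance_km / elapsed_hours


# Hypothetical readings: portals 1 km apart, plate seen 30 seconds apart.
first_sighting = datetime(2003, 1, 5, 7, 30, 0)
second_sighting = datetime(2003, 1, 5, 7, 30, 30)
LIMIT_KMH = 100  # assumed speed limit, for illustration only

speed = average_speed_kmh(1.0, first_sighting, second_sighting)
print(f"{speed:.0f} km/h, speeding: {speed > LIMIT_KMH}")
```

The privacy objection follows directly from the inputs: the calculation cannot be done without storing a plate and a timestamp from the first portal until the car reaches the second.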

        Some argue that this system is fairer than using speedtrap cameras that record only 'an incident', not 'your general behaviour'.

        Others argue that "traject-controle" as the system is called here is a clear invasion of privacy (since they necessarily need to keep a record of your license plate during the 1km you're driving).

        The same system can be used to check for people without valid insurance, who 'forgot' the mandatory APK car checkup or those who neglected to pay their road taxes.

        The possibilities are endless... In other words, where will this end?
        [ Reply to This | Parent ]
          Re:Data Mining Briefly Explained-cause-effect. (Score:0)
          by Anonymous Coward on Sunday January 05, @08:22AM (#5019263)
          "The possibilities are endless... In other words, where will this end?"

          When people stop:
          Driving without insurance.
          Forgetting their timely APK car checkups.
          Forgetting to pay their road taxes.

          In other words. The few have spoiled it for the many, and the many stayed silent while the few did it. Welcome to the world that silence built.

          [ Reply to This | Parent ]
    Well.. so? (Score:5, Interesting)
    by metlin (258108) on Saturday January 04, @03:58PM (#5015547)
    (http://www.metlin.org/ | Last Journal: Wednesday February 06, @10:49AM)
    Interesting article, but this is something that has been happening and will continue to.

    Technology being put to use to seek out enemies of the state for the world governments is nothing new.

    At least it is a good thing that companies are making good money in the process. Your privacy? That was lost long ago.

    It was only a matter of time before this happened. At least be glad that we've not yet reached the stage where they'd bother having your entire genome sequence to create solutions and replacements for you :-)

    Perhaps the author of the article has just read Cryptonomicon or something.

    Get over it, companies will track you, governments will monitor it. And there will be people who will beat both, and people who will be susceptible to both. Unfortunate, but hey, paranoia does not help either.

    And oh, first post?
    [ Reply to This ]
      Re:Well.. so? (Score:0)
      by Anonymous Coward on Saturday January 04, @06:01PM (#5016065)
      Get over it? You must be French.
      [ Reply to This | Parent ]
      Re:Well.. so? (Score:3)
      by symbolic (11752) on Saturday January 04, @06:42PM (#5016257)
      At least it is a good thing that companies are making good money in the process. Your privacy? That was lost long ago.

      Oh, the irony.

      They call themselves patriotic, and yet they're supplying the very means that are slowly turning the U.S. into a police state. Sorry, but I seriously doubt that this is what the U.S. founders had in mind, and it's certainly not the reason that U.S. war veterans both risked and sacrificed their lives. Patriots aren't sheep that blindly follow the government; they are the ones who fight to maintain the fundamental (constitutional) precepts upon which the United States was built.
      [ Reply to This | Parent ]
    Reminds me of... (Score:5, Interesting)
    by gpinzone (531794) on Saturday January 04, @03:58PM (#5015548)
    (http://slashdot.org/)
    ...how the Bayesian spam filters operate (on a much smaller scale). They find predictors of "spam" like these guys find predictors of "terrorists."

    If the false positives of this system finding terrorists are as low as the ones that identify spam, is it really unreasonable to consider that probable cause for an investigation? At least, until the 0.000001% slips by and causes a lawsuit for wrongful arrest.
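
One thing worth checking before treating a flag as probable cause is the base rate. The sketch below applies Bayes' theorem to numbers that are entirely assumed (one actual terrorist per million people, a filter that catches 99% of them and wrongly flags 0.1% of everyone else); none of these figures come from the article or from any real system.

```python
def posterior_given_flag(base_rate, true_positive_rate, false_positive_rate):
    """P(actually a target | flagged), by Bayes' theorem."""
    flagged_and_target = base_rate * true_positive_rate
    flagged_and_innocent = (1.0 - base_rate) * false_positive_rate
    return flagged_and_target / (flagged_and_target + flagged_and_innocent)


# All three inputs are illustrative assumptions, not measured values.
p = posterior_given_flag(
    base_rate=1e-6,            # assumed: 1 in a million is an actual target
    true_positive_rate=0.99,   # assumed: filter catches 99% of targets
    false_positive_rate=0.001, # assumed: 0.1% of innocents get flagged
)
print(f"P(target | flagged) = {p:.4%}")
```

Under these assumptions roughly one flagged person in a thousand is a real target. Spam is common, so a low false-positive rate gives a usable filter; when the thing being searched for is extremely rare, the same false-positive rate buries the true hits.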
    [ Reply to This ]
      Re:Reminds me of... (Score:2, Interesting)
      by Anonymous Coward on Saturday January 04, @06:06PM (#5016083)
      With a spam filter, the penalty for false positive is perhaps a lost sale or an annoyed friend/coworker.

      With a terrorist classification filter, the penalty for a false positive could cost some innocent person days/weeks in prison and thousands of dollars in lost wages and legal fees. And that's assuming they are a US citizen. A non-citizen could be held indefinitely, completely destroying any career they might have.
      [ Reply to This | Parent ]
        Re:Reminds me of... (Score:3, Interesting)
        by gpinzone (531794) on Saturday January 04, @08:35PM (#5016823)
        (http://slashdot.org/)
        Yes, but remember that the current methods aren't much better. I mean, right now there's lots of complaints about how the USA is racially profiling Middle Eastern men. Whether or not this profiling is justified could be based on a report of such a filter.

        The issue isn't whether or not we should use data mining to profile individuals or groups. Profiling will occur no matter what. What these methods do is help find parameters that more accurately identify candidates rather than just assume all Middle Easterners are automatically guilty until proven otherwise.
        [ Reply to This | Parent ]
      Re:Reminds me of... (Score:0)
      by Anonymous Coward on Sunday January 05, @06:31AM (#5018980)
      At least, until the 0.000001% slips by and causes a lawsuit for wrongful arrest.

      How do you launch a lawsuit when you're in an Army "detention camp" like the 550 or so "suspected terrorists" stuck down near Cuba?

      A US Judge said they didn't fall under US jurisdiction because they weren't on Mainland US soil, despite being on a US Army compound. Three English Judges overturned her decision... yet they stay confined (2 x 15 minutes exercise/week, for example)

      You can start counting how long before you guys have NO rights left anymore.

      Oh, and Fuck America.

       
      [ Reply to This | Parent ]
      Re:Reminds me of... (Score:0)
      by Anonymous Coward on Monday January 06, @02:59AM (#5024018)
      Data mining can be quite different. Bayesian methods used in the spam filters are supervised, which means that you show it examples of spam vs non-spam data and the system will learn to tell the difference between the two. It is "supervised" because you act as the teacher.
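
To make "supervised" concrete, here is a minimal sketch of a word-count naive Bayes classifier with add-one smoothing. It is a generic textbook version, not the formula any particular spam filter uses, and the training messages and function names are hypothetical. The labels in `training` are the "teacher" part.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled examples supplied by the "teacher".
training = [
    ("spam", "cheap pills buy now"),
    ("spam", "buy cheap credit now"),
    ("ham", "meeting notes attached"),
    ("ham", "lunch meeting tomorrow"),
]


def train(examples):
    """Count words per label and how many messages carry each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for label, text in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest log-probability for `text`."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_messages = sum(label_counts.values())

    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the prior: how common this label was in training.
        score = math.log(label_counts[label] / total_messages)
        words_in_label = sum(word_counts[label].values())
        for word in text.split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (words_in_label + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


word_counts, label_counts = train(training)
print(classify("buy cheap pills", word_counts, label_counts))
print(classify("lunch meeting", word_counts, label_counts))
```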

      Data mining methods can be unsupervised, which means no teacher exists. These methods learn to spot correlations in the data. E.g., a supermarket data mining system may find that people who buy milk often buy oranges too. The supermarket relies on the data mining system to discover interesting info like this that it didn't know before. It will then use this to some advantage. E.g., it could place milk and oranges next to each other to make it convenient for customers. Or it could intentionally put them far apart in an attempt to get customers to buy other items as well.
      [ Reply to This | Parent ]
    datamine yourself (Score:-1)
    by anonymous coword (615639) on Saturday January 04, @03:59PM (#5015550)
    (http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=%22anonymous+coword%22 | Last Journal: Friday January 03, @05:10PM)
    $ strings /dev/mem | grep "goatse.cx"

    Because you KNOW you visited it today!
    [ Reply to This ]
    to what end? (Score:2, Interesting)
    by loveandpeace (520766) on Saturday January 04, @03:59PM (#5015551)
    (http://www.silvercloak.com/loveandpeace/ | Last Journal: Friday December 27, @10:11AM)
    the more i read about data mining, the more it seems to provide a connectivity and interaction leap, a step we are really due, in a technological sense. when the internet was new and all (shortly after Al Gore invented it), there was much talk of how Big Brother would swoop in and turn us into ones and zeros, monitor our every move, and control us through the new portal. that hasn't happened yet (though Ashcroft is trying). does it seem that data mining is more harmful (making us all into terrorists for buying fireworks and seeing born on the fourth of july in the same day) than good (allowing better prediction of supply and demand to lower costs and raise productivity)?
    [ Reply to This ]
    profiteering? (Score:5, Interesting)
    by SHEENmaster (581283) <sheenmaster@NosPam.flame.dnsart.com> on Saturday January 04, @03:59PM (#5015553)
    (http://flame.dnsart.com/ | Last Journal: Friday December 27, @04:50AM)
    Today, however, companies that excel in connecting the data dots are finding a lifeline in a customer whose IT ineptitude is matched only by its means: the U.S. government, which will spend $53 billion on information technology this year. The Federal Government's inability to share and analyze information became clear in the months after the 9/11 attacks.

    While I won't argue against the government's inability to do anything but waste money, I do think that these "anti-terrorism" dealies are going too far. We know that they are spending $53 billion on information technology. When they spend it on a hammer or a toilet seat I know that something is getting done, but "information technology" makes me suspicious.

    Granted my opinion is largely a result of window flags selling in excess of twenty dollars and not hearing the results of such spending. In fact, I haven't heard of a single terrorist act averted since 9/11. It couldn't hurt to inform us when the spending pays off; could it?

    Is this information actually getting results, or is it just profiteering of the corporations that we so love to slander and libel?
    [ Reply to This ]
    Wow! What an eye-opener! (Score:1)
    by long_john_stewart_mi (549153) on Saturday January 04, @03:59PM (#5015555)
    And here I thought 'data miners' were seven really short geeks, holed up in a server closet with some hot chick that's hiding from her evil step-mother. Well, you learn something new every day! =)
    [ Reply to This ]
    Not that it helps (Score:2)
    by Alien54 (180860) on Saturday January 04, @04:00PM (#5015559)
    (http://radiofreenation.net/)
    Noting all of the ways certain monopolies have acted illegally has not helped in getting appropriate penalties for them in court.

    data is useless by itself unless it can be used appropriately.

    sort of like the list on conservative site NewsMax that finds that the vast majority of truly corrupt politicians in the past year were democrats [newsmax.com]. What a coincidence!

    What are the odds of finding out more things like this, like at the office of Total Information Awareness? Or the Transportation Security Administration's list of people who cannot fly [interventionmag.com]?

    [ Reply to This ]
      You guys wanted information to be free. (Score:0)
      by Anonymous Coward on Saturday January 04, @04:13PM (#5015623)
      Well you're getting EXACTLY what you want. Don't cry and complain, data is data. To complain is to be a hypocrite. After all everything should be Open Source, eh? The moral: beware of what you ask for, you may just get it.
      [ Reply to This | Parent ]
    Michael Sims business plan (Score:-1, Offtopic)
    by Anonymous Coward on Saturday January 04, @04:03PM (#5015568)
    1. Make production illegal.
    2. Make profit illegal.
    3. Make capitalism illegal.
    4. ???
    5. Profit!!! ( the rules are different for Sims, of course )

    [ Reply to This ]
    Print Link (Score:4, Informative)
    by VargrX (104404) on Saturday January 04, @04:04PM (#5015572)
    (Last Journal: Thursday October 17, @06:24PM)
    dunno 'bout any one else, but I don't care for all the ads...
    Print Link [time.com]

    [ Reply to This ]
    Already used in mineral exploration (Score:4, Informative)
    by core plexus (599119) on Saturday January 04, @04:06PM (#5015578)
    (http://xnewswire.com/ | Last Journal: Monday January 06, @04:40PM)
    We've been using data mining in mineral exploration for quite some time now, and it really helps given the tremendous volumes of data generated from modern geophysical, geochemical, and geological exploration.

    In related news: Seeking Sperm, Not Sex, Online [xnewswire.com]

    [ Reply to This ]
    Before You Jeer... (Score:3, Informative)
    by robbyjo (315601) on Saturday January 04, @04:06PM (#5015580)
    (http://slashdot.org/ | Last Journal: Sunday December 29, @11:12PM)

    You may want to read this book [aaai.org] and see for yourself whether data mining would make a breakthrough in the future.

    [ Reply to This ]
      Re:Before You Jeer... (Score:2, Interesting)
      by arasinen (22038) on Saturday January 04, @05:17PM (#5015875)
      Another good book that explains the basics of data mining is Principles of Data Mining by Hand et al.

      It is perhaps not the most simple book around, but it covers a lot of important issues. Furthermore it doesn't ignore the role of computer science, as two of the authors have a CS background.

      You won't find explicit instructions about how to build your own Google, but it surely does wonders for your insight.
      [ Reply to This | Parent ]
    Data mining plans (Score:0)
    by Anonymous Coward on Saturday January 04, @04:06PM (#5015583)
    1. Collect data
    2. ???
    3. Profit
    [ Reply to This ]
    OLD NEWS! (Score:0)
    by Anonymous Coward on Saturday January 04, @04:06PM (#5015585)
    If you read the title, you would see that it was dated 2002-12-23! That's so last year. Oh well, at least it's not a dupe!
    [ Reply to This ]
    Data mining for consumers? (Score:1, Interesting)
    by Anonymous Coward on Saturday January 04, @04:07PM (#5015592)
    "Throughout the '90s, data mining spread from one industry to the next, enabling companies to know more about customers' needs and to zero in on the characteristics that distinguish the customers they want from those they do not. A credit-card company using a system designed by Teradata, a division of NCR, found that customers who fill out applications in pencil rather than pen are more likely to default. A major hotel chain discovered that guests who opted for X-rated flicks spent more money and were less likely to make demands on the hotel staff, according to privacy consultant Larry Ponemon. These low-maintenance customers were rewarded with special frequent-traveler promotions. Victoria's Secret stopped uniformly stocking its stores once MicroStrategy showed that the chain sold 20 times as many size-32 bras in New York City as in other cities and that in Miami ivory was 10 times as popular as black. Aspect Communications, based in San Jose, Calif., sells a program that identifies callers by purchase history. The bigger the spender, the quicker the call gets picked up. So if you think your call is being answered in the order in which it was received, think again."

    Couldn't the consumer use such information to get a better deal? Also of course there's the "abuse" aspects for the businesses, and governments that use this.
    [ Reply to This ]
      Good excuse for porn! (Score:1)
      by MadAnthony02 (626886) on Saturday January 04, @07:45PM (#5016574)
      (http://www.madanthony.net/)

      A major hotel chain discovered that guests who opted for X-rated flicks spent more money and were less likely to make demands on the hotel staff, according to privacy consultant Larry Ponemon. These low-maintenance customers were rewarded with special frequent-traveler promotions.

      Cool. Next time I go on a trip I can order some in room porn and justify it because I'll get better deals in the future!

      [ Reply to This | Parent ]
    *shaking head* (Score:1)
    by quikgrit (638508) on Saturday January 04, @04:11PM (#5015614)
    After 9/11, many tech companies saw opportunities for both patriotism and profit. Oracle offered to donate the software to create a federal identity database.


    Well, I suppose it's nice to know that the handbasket we're going to hell in is at least free.
    [ Reply to This ]
    Makes me think of Bowling For Columbine (Score:2, Interesting)
    by flopsy mopsalon (635863) on Saturday January 04, @04:15PM (#5015629)
    I couldn't help noticing the Time.com article made reference to crime and terrorism, particularly the September 11 WTC/Pentagon attacks (which happened over a year ago), and to the recent Washington Sniper killings (which ended months ago), in spite of the fact that this article would have been just as fascinating if they had simply used the business examples as illustration.

    In the movie 'Bowling For Columbine' Michael Moore speculates that one of the root causes of gun violence in the US is the type of fearmongering the US media engages in in an effort to keep their sales/ratings up.

    It looks like Time.com's gratuitous exploitation of US fears of crime and terrorism might be an example of this.

    [ Reply to This ]
      Re:Makes me think of Bowling For Columbine (Score:2)
      by BWJones (18351) on Saturday January 04, @08:35PM (#5016824)
      (Last Journal: Thursday January 02, @05:41PM)
      I couldn't help noticing the Time.com article made reference to crime and terrorism, ....in spite of the fact that this article would have been just as fascinating if they had simply used the business examples as illustration.

      Sure, fear sells lots of stuff: MREs, guns, ammo, radiation pills (iodine), bomb shelters, etc. The thing that people should realize about data mining software, though, is that its application to terrorism and consumer tracking is new but the technology is not. In fact, people have been using it in remote sensing to prospect from space for gold and oil, among other things; it has been used since the late '70s to interpret satellite images for the CIA and NRO; it has been used for psychological research; and I use a form of it for retinal research. What should not happen with the fearmongering is that the technology gets a bad name because of those who want to abuse it. Like many technologies, data mining is a tool that can be misused, but its application can also do tremendous good.

      [ Reply to This | Parent ]
    Data Mining as used by Colombian Drug Cartels ... (Score:4, Interesting)
    by Anonymous Coward on Saturday January 04, @04:21PM (#5015651)
    Here is a real life story about data mining and its potential for brutal consequences. This was a very early application. Those who were fingered were killed. Of course, they adopted our new (lack of) due process rules a decade ago...

    http://www.business2.com/articles/mag/0,1640,41206,00.html
    [ Reply to This ]
    KnowledgeMiner 5.0 software for Mac OS 9. (Score:2, Informative)
    by alchemist68 (550641) on Saturday January 04, @04:22PM (#5015655)
    can be located here:

    http://www.knowledgeminer.net/

    I've thought about using this software to analyze stocks to purchase, but never got around to looking at the information required for the software to give me an edge in the market. Looks promising though.
    [ Reply to This ]
    obligatory dilbert strip* (Score:1)
    by RyLaN (608672) on Saturday January 04, @04:22PM (#5015656)
    (http://mssmcamp2.tripod.com/merritt)
    Panel One:
    Dogbert Consults
    My data mining software has found another message from God.
    Panel Two
    It says you've been stealing lunches from the refrigerator in the break room.
    Panel Three
    Then it says "Ha, Ha that wasn't pudding!"
    btw, that was January 3rd on the Dilbert calendar this year.
    [ Reply to This ]
    REDHAT STICKER OMG OMG (Score:-1, Redundant)
    by autopr0n (534291) on Saturday January 04, @04:25PM (#5015671)
    (http://autopr0n.com/)
    LOL LOL LOL.
    [ Reply to This ]
    Objection to the numbers (Score:4, Informative)
    by rootmonkey (457887) on Saturday January 04, @04:31PM (#5015691)
    The article uses NASDAQ as an example of having to process terabytes of data on a daily basis, which data mining software can help filter. The software may be useful, but NASDAQ does not process terabytes per day of incoming data. I work in the market data industry; we take exchange feeds from around the world, including NASDAQ, and we don't process close to that much. OPRA (options) has the most data per day, and that is only on the order of tens of GB.
    [ Reply to This ]
    huh??? (Score:1)
    by pummer (637413) <pummerNO@SPAMdygo.com> on Saturday January 04, @04:33PM (#5015697)
    (http://www.angelfire.com/games/pummcodes)
    i don't get it. what's that red hat thingy mean??
    [ Reply to This ]
    PR (Score:1, Informative)
    by Anonymous Coward on Saturday January 04, @04:33PM (#5015702)
    This article seemed to me more like a concatenation of a few press releases, especially the ones noting data mining successes, than "news." Then again, most news is simply rehashed PR (as a lecturer on NPR noted the other night).

    Let our Data Mining Products make your life Better!

    To save everyone time and annoying popups, consider visiting the sites of some of the products mentioned. These pages are every bit as insightful and critical as the article:

    http://www.autonomy.com/
    http://www.currentanalysis.com/
    http://www.srdnet.com/
    http://www.digimine.com/ (this didn't load for me, but I have Javascript disabled...)
    http://www.unisys.co.uk/public-uk/justice/police/default.asp?cn=pa

    Posting anonymously to dodge accusations of karma whoring.
    [ Reply to This ]
    Data mining companies (Score:2, Interesting)
    by MrWa (144753) on Saturday January 04, @04:34PM (#5015705)
    (http://www.hamete.com/)
    So "Data-mining companies have been among the hardest hit in recent years" is claimed by Time.com, which goes on to use MicroStrategy as a prime example of a company that skyrocketed in value and plummeted in the "tech crash" later. Oh, and by the way, they also overstated earnings. What these articles about the "tech crash" need to do is normalize the comparisons, because these companies that ballooned in value so much, then crashed, probably just experienced a slight correction due to the stupid values they attained to begin with!

    As for data mining itself: more power to them. The government gaining the ability to mine the data it already has should mean that we don't need more organizations, more intrusive investigations, etc. Every report or credible news item about post-9/11 studies indicates that we already had enough information, so there should be no need to create new laws that allow for more information to be collected. Just use what you have already, kthx.
    What would be nice is if this data mining allowed Muslims living in the U.S. to stop having to worry whenever they go outside. Look at the information publicly available, which may provide patterns of "nonobvious" connections, and let people live their lives in peace, regardless of background.

    As a consumer, everything I do in public I consider public information. If a business uses this to better serve me, all the better. Maybe this will mean I don't have to watch feminine ads on TV, or the phone gets answered faster when I call. Maybe it just means that the customer rep knows my name and what I bought already.

    [ Reply to This ]
    question from non-american (Score:0)
    by Anonymous Coward on Saturday January 04, @04:48PM (#5015773)
    ''Victoria's Secret stopped uniformly stocking its stores once MicroStrategy showed that the chain sold 20 times as many size-32 bras in New York City as in other cities and that in Miami ivory was 10 times as popular as black.''

    Ok. But WHY? is a size-32 bra an indication of something?
    [ Reply to This ]
    Digging For Autism Correlations (Score:2, Interesting)
    by Baldrson (78598) on Saturday January 04, @04:51PM (#5015783)
    (http://www.geocities.com/jim_bowery)
    If you look closely at autism statistics, you'll notice it has a lower average correlation with all other statistics than 95% of the variables normally available to epidemiologists [clanarchy.com].

    So, I decided to mine almost 200 by-State demographic variables for correlates to autism by running through every combination of 2 variables via multiplication or division under a polynomial, exponential or null transformation -- then sorted them by their correlation to autism in the year 2000 [clanarchy.com].

    This is a case where what was "mined" was not just the raw data but various arithmetic combinations of statistical variables derived from the data. Some additional work is needed to make the figure of merit statistical significance rather than just correlation. I couldn't find Perl modules that provide a p-value (the probability of seeing a correlation at least this strong if the null hypothesis were true) for correlations.
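
    The brute-force search described above can be sketched roughly as follows. This is a minimal illustration in Python rather than the Perl the poster used, on made-up toy data; the names rank_pairs and permutation_p_value are hypothetical, the polynomial and exponential transformations are left out, and the permutation test is only one assumed way to put a significance figure on a correlation.

```python
import itertools
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def rank_pairs(variables, target):
    """Correlate every product and ratio of two variables with the target."""
    results = []
    for (na, a), (nb, b) in itertools.combinations(variables.items(), 2):
        product = [x * y for x, y in zip(a, b)]
        results.append((f"{na}*{nb}", pearson(product, target)))
        if all(y != 0 for y in b):  # skip ratios that would divide by zero
            ratio = [x / y for x, y in zip(a, b)]
            results.append((f"{na}/{nb}", pearson(ratio, target)))
    # Strongest correlation (positive or negative) first.
    return sorted(results, key=lambda r: abs(r[1]), reverse=True)

def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Share of random shuffles whose |r| is at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)

# Made-up toy data, one value per "state"; not real statistics.
variables = {
    "income": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "density": [3.0, 1.0, 4.0, 2.0, 6.0, 5.5],
    "rainfall": [5.0, 3.0, 6.0, 2.0, 4.0, 1.0],
}
# The target is built from income * density so the search has something to find.
target = [i * d for i, d in zip(variables["income"], variables["density"])]

ranking = rank_pairs(variables, target)
best_name, best_r = ranking[0]
best_series = [i * d for i, d in zip(variables["income"], variables["density"])]
p_value = permutation_p_value(best_series, target)
print(best_name, round(best_r, 3), p_value)
```

    With hundreds of variables the number of combinations grows quickly, so some of the top-ranked pairs will look strong by chance alone; a significance figure per pair does not by itself correct for that.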

    [ Reply to This ]
    Uber Loyalty Card in the UK (Nectar) (Score:5, Insightful)
    by Boss, Pointy Haired (537010) on Saturday January 04, @04:58PM (#5015802)
    Three large British retail companies have recently created a joint loyalty card.

    Nectar has been set-up by Sainsbury's (a supermarket), Barclays (a financial services company) and BP (a petrol filling station company).

    I didn't mind Sainsbury's knowing that I eat junk, but now that they're telling Barclays what junk I eat I end up with Barclays putting my life insurance premiums up.

    Interesting stuff.

    [ Reply to This ]
    Fayyad (Score:1)
    by nrobert (605941) on Saturday January 04, @05:45PM (#5015996)

    In the last page, this Fayyad of digiMine claims that he doesn't want to work with the govt because the 'Bush administration' hasn't clearly enough articulated its vision of what it wants.

    I hope he was misquoted. There may be some legit reasons not to work with the US Govt. on anti-terrorism technology, but Mr. Fayyad is being either overly dismissive or just immune to opportunity by saying what he's quoted as saying. It sure is nice when the client comes to you with a fully articulated vision for the solution he needs, but most just start out with stated or even just perceived needs and leave it to the, ahem, vendors to provide the solution/vision.

    On another note, it would be interesting to read an article with some technical detail beyond a generic reference to XML. Maybe someone can post a link.

    [ Reply to This ]
      Why (Re:Fayyad) (Score:0)
      by Anonymous Coward on Saturday January 04, @07:34PM (#5016497)
      digiMine sells many different types of data mining solutions, but I believe their main focus is customer relationship management and customer segmentation.

      These areas apply to business more than they apply to Govt's...
      [ Reply to This | Parent ]
    In a nutshell... (Score:1)
    by Magus311X (5823) on Saturday January 04, @05:45PM (#5015999)
    You can mine data to look for hidden business trends. If you mine the data really hard, you can see messages from GOD.
    [ Reply to This ]
    Data Mining is the wrong term (Score:2, Interesting)
    by nrobert (605941) on Saturday January 04, @05:59PM (#5016060)
    The term data mining is misleading. Mining is more a matter of sifting through lots of junk to get at the valuable material. That's not exactly what 'data mining' is about.

    If you want valuable information and you know what you're looking for, you just query. Find X in pile of data. That's mining. I know it's a semantic comment, but mining's not what we're talking about doing here.

    Data mining is more like what geneticists searching for a genetic cause for a cancer are doing. Finding usable correlations and meaningful precursors. We don't call cancer-fighting biologists 'gene miners'. I think the term mining belittles a more complicated activity.

    A better term? Data Correlating? Mining also just sounds brutish.

    [ Reply to This ]
    The problem with automatic identification (Score:2, Insightful)
    by Sgs-Cruz (526085) on Saturday January 04, @06:07PM (#5016087)
    The problem with automatic identification of any specific type of person within a large group (say, the entire U.S. population, or, hey, the entire world! Why not?) is the obscenely low false positive rate you must have. I mean, to identify 100 terrorists in 270 million people, sure, a 50% false negative rate is fine (catching 50 terrorists is better than catching none, right?), but to keep those real terrorists from being swamped by innocent people who happen to match a profile, the false positive rate must be lower than about 0.000037% (roughly 100 in 270 million)... that's almost impossible to achieve. And that is why automated terrorist (or anything) identification is still a long way off.
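
    The arithmetic behind that figure can be checked with a few lines of Python. This is only a sketch using the numbers from the comment (270 million people, 100 terrorists, 50% false negative rate); the function names are made up for illustration.

```python
population = 270_000_000    # figure used in the comment
terrorists = 100            # figure used in the comment
false_negative_rate = 0.50  # half of the real terrorists are missed

def flagged_counts(false_positive_rate):
    """Expected number of real and innocent people flagged by the filter."""
    true_hits = terrorists * (1 - false_negative_rate)
    false_hits = (population - terrorists) * false_positive_rate
    return true_hits, false_hits

def precision(false_positive_rate):
    """Share of flagged people who are actually terrorists."""
    true_hits, false_hits = flagged_counts(false_positive_rate)
    return true_hits / (true_hits + false_hits)

base_rate = terrorists / population
print(f"base rate: {base_rate:.6%}")

for fpr in (0.01, 0.0001, base_rate):
    true_hits, false_hits = flagged_counts(fpr)
    print(f"false positive rate {fpr:.6%}: {true_hits:.0f} real hits, "
          f"{false_hits:,.0f} innocent people flagged, "
          f"precision {precision(fpr):.4%}")
```

    Even a filter that is wrong about only 1% of innocent people flags about 2.7 million of them next to the 50 real hits. The false positive rate has to fall to around the base rate itself before real hits make up a sizable share of the flagged list.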
    [ Reply to This ]
      Re:The problem with automatic identification (Score:2, Interesting)
      by nrobert (605941) on Saturday January 04, @06:30PM (#5016200)

      I'm not sure the goal is to have the miner spit out names of confirmable terrorists with that kind of accuracy. Your comment is fair if you're looking for that kind of entirely automated solution, but that's not the goal. It doesn't need to be 100% accurate in order to mitigate risk and pay for itself. Neither does the J Crew web site product predictor.

      The goal is definitely to help single out people that are worth further investigation. By motivated, thinking, observant humans. That's all.

      I also think you might be a little bit reductionist in your estimate of 100 terrorists. It's quite possible that there are many more, though I suppose it doesn't matter because even if you're looking for just one person, it's still worth doing.

      Given that you're looking for a reasonably good filter to find qualifiers for a round of investigation, a better metric to use might be the number of people you're willing to investigate as a ratio against those you hope to positively I.D. You might argue that you'd be happy to investigate 5,000 people just to find one 'terrorist'. If so, and you're looking for an estimated 100 terrorists, you can multiply to get 500,000 'persons of interest', or about 0.19% of the USA population. This percentage is much more achievable, and besides, you can then use a different algorithm to identify which of these you should interview or research further first.
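
      As a quick check of that multiplication, here is a small Python sketch. It assumes the 270 million population figure from the parent comment and, for the last line, that the filter catches all 100 targets; the variable names are made up.

```python
population = 270_000_000          # figure from the parent comment
estimated_terrorists = 100
investigations_per_find = 5_000   # people you are willing to check per positive I.D.

persons_of_interest = estimated_terrorists * investigations_per_find
share_of_population = persons_of_interest / population

# If all 100 targets are on the list, the rest of the list is false positives.
false_positives = persons_of_interest - estimated_terrorists
tolerated_false_positive_rate = false_positives / (population - estimated_terrorists)

print(f"{persons_of_interest:,} persons of interest")
print(f"{share_of_population:.2%} of the population")
print(f"tolerated false positive rate: {tolerated_false_positive_rate:.3%}")
```

      Put this way, the first-pass filter is allowed a false positive rate several thousand times higher than the 0.000037% figure in the parent comment, because humans do the second pass.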

      It seems pretty manageable to me. I also think your assessment of the 50% false negative rate is too rosy. It seems to me that the risks would be serious enough of even 1 getting away (as in scanning baggage for instance) that you'd want to cast the widest net possible and then narrow those carefully. False negatives may be more costly than you are suggesting.

      [ Reply to This | Parent ]
    an advertisement for privatization of security? (Score:1)
    by fermion (181285) <mailto:lowt@bigfo%20o%20t%20.%20com> on Saturday January 04, @06:35PM (#5016225)
    (Last Journal: Friday December 20, @12:24AM)
    This article seems to explain very little of data mining, and is far from concise. The real gist of the article seems to be that data mining companies, which may be guilty of fraud and certainly seem to lack a viable business plan, are once again suckling off the teat of mother U.S.A. instead of finding the private customers that they all would claim is the basis of capitalism. Likewise, the military contractors are desperately trying to get into the data mining game to maintain relevance.

    I also take issue with the statement
    a customer whose IT ineptitude is matched only by its means
    which is clearly a jab at the hard working professionals of the US government and an effort to push privatization of IT functions. I have worked with IT professionals in Academic, Industrial, Commercial, and Government settings. I will tell you that IT professionals in all these settings range from incompetent to brilliant. The difference is that, until recently, US employees have not had to live with the fear of random layoffs or arbitrary insurance reductions. I often wonder why it is unpatriotic to insult policemen, firemen, or military officers, but when it comes to the professionals that allow these people to work, no insult is severe enough.

    [ Reply to This ]
    Nice story but (Score:1)
    by Qzukk (229616) on Saturday January 04, @07:10PM (#5016390)
    *how* does data mining work? (beyond "it makes connections between various data.") I don't recall it ever coming up in any of my classes. It seems like it would be an AI problem.

    If everyone's going to go out and be paranoid, might as well know what we're being paranoid about.
    [ Reply to This ]
      Re:Nice story but (Score:0)
      by Anonymous Coward on Saturday January 04, @07:40PM (#5016537)
      According to Fayyad, "Data mining is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data [1]".

      Basically it involves AI, machine learning and statistics among other things...

      [1] Fayyad, U., G. Piatetsky-Shapiro, and P. Smyth, From Data Mining to Knowledge Discovery in Databases. AI Magazine, 1996. 17: pp. 37-54.
      [ Reply to This | Parent ]
      How it works (Score:1)
      by tqft (619476) <ianburrows_auNO@SPAMyahoo.com> on Sunday January 05, @04:22AM (#5018765)
      The best device I know of for turning data into information is the human visual cortex. Forget AI, use HI (Human Intelligence).

      The trick is to reduce the vast amount of data to something that can be scanned at a glance.

      Typically you produce a list of relevant items (e.g. by grabbing the doc IDs based on keywords from the source data), sorted by most relevant (the scoring system). So if three keywords match in a single doc, score it high. If those three keywords appear in another doc, score both high and set the "both" flag. The sorted list, from high score to low, is then scanned. Experience soon tells you if your scoring system is working. The list you now have (electronically, hopefully) has links to the original docs; the analyst then clicks and reads. If relevant, act. If not, go to the next item.
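
      A bare-bones version of that scoring pass might look like the Python sketch below. The documents, keywords and the score_documents name are invented for illustration, the score is simply the number of distinct keywords a doc contains, and the "both" flag is left out.

```python
def score_documents(docs, keywords):
    """Return (score, doc_id) pairs for matching docs, highest score first."""
    wanted = {k.lower() for k in keywords}
    scored = []
    for doc_id, text in docs.items():
        words = set(text.lower().split())
        score = len(wanted & words)  # number of distinct keywords present
        if score:
            scored.append((score, doc_id))
    # Highest score first; ties broken by doc id so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored

# Made-up source data: doc id -> text.
docs = {
    "doc-1": "wire transfer to offshore account flagged",
    "doc-2": "quarterly report on store traffic",
    "doc-3": "offshore account opened before wire transfer",
    "doc-4": "account password reset",
}
keywords = ["wire", "offshore", "account"]

ranked = score_documents(docs, keywords)
for score, doc_id in ranked:
    print(score, doc_id)  # the analyst reads these from the top down
```

      The point of the design is that the code only orders the pile; the decision about whether a doc is actually relevant stays with the person reading the list.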

      [ Reply to This | Parent ]
      Re:Nice story but (Score:0)
      by Anonymous Coward on Sunday January 05, @06:54PM (#5022195)
      Data mining is a kind of umbrella term for a load of different machine learning and statistical techniques, when applied to a fuck-ton of data. Yes, there's some bits from AI in there, and neural nets do get used, but there's also statistical stuff like k-means clustering. Basically, any technique that can be used to form a model of all of your data, and then apply it to some more, can be used for data mining.
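
      To make "form a model of your data, then apply it to some more" concrete, here is a tiny k-means sketch in Python on one-dimensional made-up numbers. The function names, the data and the fixed starting centers are all assumptions for illustration; real tools pick starting centers more carefully and work in many dimensions.

```python
def nearest_center(value, centers):
    """Index of the center closest to value."""
    return min(range(len(centers)), key=lambda i: abs(value - centers[i]))

def kmeans_1d(values, centers, iterations=20):
    """Lloyd's algorithm on plain numbers; returns the final cluster centers."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest_center(v, centers)].append(v)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

# Made-up customer spend figures with two obvious groups.
spend = [12, 15, 14, 11, 95, 102, 98, 110]

# "Form a model": learn two cluster centers from the existing data.
centers = kmeans_1d(spend, centers=[0.0, 50.0])
print(centers)

# "Apply it to some more": label new customers by their nearest center.
for new_value in (20, 90):
    print(new_value, "-> cluster", nearest_center(new_value, centers))
```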
      [ Reply to This | Parent ]
    White Paper How to Catch a Thief (Score:1)
    by Onyxviper (540651) on Saturday January 04, @09:51PM (#5017158)
    I have not read all of this, but some of you with questions on how the actual Data Mining process works might get something out of it. Some of it is over my head, but that is not saying much. Check it out. http://sales.visualanalytics.com/whitepaper/index2.cfm?Template=HowToCatchAThief
    [ Reply to This ]
    Define "Data mining" (Score:1)
    by ggwood (70369) on Sunday January 05, @03:22AM (#5018606)
    (http://home.socal.rr.com/tabbyandgreg/)
    I always think of artificial intelligence when I hear data mining, and I kind of assumed that was what would be clarified (at least) by this article. However I was wrong.

    The most concrete evidence of success that is presented is that Victoria's Secret realized it sold tons of size X bras in New York and 10x as many white as black items in Miami. Um, I really hope they didn't have to hire a firm to tell them that. Don't they have spreadsheets? Does anyone look around the store and notice what sells?

    Which moves me on to another point. Companies seem to have very little faith in their employees and ask very little of them these days. (Gets out his pipe and rocking chair.) I remember when my sister got her first job at an ice skating rink. They sold ice skating outfits to (mostly) mothers of young girls taking private ice skating lessons. My sister could tell you at a glance what outfits would sell first. (As I recall it was the most garish ones - she used to specifically ask for "ugly" or "anything that looks like it was designed by the color blind").

    Nowadays, when I have to ask for help finding something in a store and I suggest a different location for it (real life example: Why don't you stock the phone connectors with your phones?) I get blank stares and comments along the lines of "ya, like my manager would listen".

    [ Reply to This ]
    yep, me.. (Score:3, Funny)
    by geekoid (135745) <notities@yahoo.com> on Sunday January 05, @03:26AM (#5018618)
    (http://slashdot.org/ | Last Journal: Thursday February 21, @04:37PM)
    ..and six other dwarfs grab our pickaxes and lanterns, and go to the data mines.
    those 1's and 0's can be tricky..

    [ Reply to This ]
    That's not data mining! (Score:2)
    by djkitsch (576853) on Sunday January 05, @08:06AM (#5019222)
    (http://www.kitschdesigntech.co.uk/)
    Software developed by Autonomy, based in Cambridge, England, connected BAE's research databases and alerted civilian aircraft engineers to the fact that the wing-construction problem they were working on was also being addressed by the company's military division.

    That's not exactly a task for data miners - it's just bad communication! They could have done exactly the same thing just by making sure the directors were paying attention...there seems to be a big market for telling people the perfectly obvious.
    [ Reply to This ]
    In Soviet Russia (Score:0)
    by Anonymous Coward on Sunday January 05, @09:42AM (#5019473)
    In Soviet Russia, the data mines you!
    [ Reply to This ]
    FOLDOC (Score:1)
    by gasull (92697) <(gasull) (at) (myrealbox.com)> on Sunday January 05, @01:31PM (#5020490)

    First of all, if you don't know what data mining is, read the definition in FOLDOC (Free On-Line Dictionary Of Computing) [ic.ac.uk].

    [ Reply to This ]
     
    All trademarks and copyrights on this page are owned by their respective owners. Comments are owned by the Poster. The Rest © 1997-2002 OSDN.