Posts Tagged ‘Big Data’

An aggregate in mathematics is defined as a “collective amount, sum, or mass arrived at by adding or putting together all components, elements, or parts of an assemblage or group without implying that the resulting total is whole.” While there are many uses for aggregation in data science (examples include log aggregation, spatial aggregation, and network aggregation), it always pertains to some form of summation or collection. In this article, we’ll look at the mechanics of aggregation in Apache Spark, a top-level … Read the rest
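The idea of aggregation as "summation or collection" can be illustrated with a minimal sketch in plain Python. This is not Spark code, and the article's own examples are not shown in this excerpt. The helper `aggregate` and the sample `partitions` are hypothetical names introduced here; the helper only loosely mirrors the fold-then-merge shape that distributed aggregation typically takes, where each partition is reduced on its own and the partial results are then combined.

```python
from functools import reduce


def aggregate(partitions, zero, seq_op, comb_op):
    """Fold each partition with seq_op, then merge the partial results with comb_op.

    Hypothetical helper for illustration only; it runs sequentially in one process.
    """
    partials = [reduce(seq_op, part, zero) for part in partitions]
    return reduce(comb_op, partials, zero)


# Hypothetical data: six numbers split across three partitions.
partitions = [[1, 2, 3], [4, 5], [6]]

# Carry a (sum, count) pair so that a mean can be derived at the end.
total, count = aggregate(
    partitions,
    (0, 0),
    lambda acc, x: (acc[0] + x, acc[1] + 1),   # fold one value into a partial result
    lambda a, b: (a[0] + b[0], a[1] + b[1]),   # merge two partial results
)
mean = total / count
print(total, count, mean)
```

Keeping the per-partition fold separate from the merge step is what allows the partial results to be computed independently before they are combined.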

Choosing the right database for the job can be a daunting task, particularly if you’re entertaining the full space of SQL and NoSQL options. If you’re looking for a flexible, general-purpose option that allows for fluid schemas and complex nested data structures, a document database might be right for you. MongoDB and Couchbase Server are two popular choices. How should you choose?

MongoDB combines the benefits of immense popularity, support for simple graph searches, and the ability to perform SQL … Read the rest

Dumping Moore’s Law is perhaps the best thing that could happen to computers, as it’ll hasten the move away from an aging computer architecture holding back hardware innovation.

That’s the view of prominent scientist R. Stanley Williams, a senior fellow at Hewlett Packard Labs. Williams played a key role in the creation of the memristor by HP in 2008.

Moore’s Law is an observation made by Intel co-founder Gordon Moore in 1965 that has helped make devices smaller and … Read the rest

The more cores you can use, the better — especially with big data. But the easier a big data framework is to work with, the harder it is for the resulting pipelines, such as TensorFlow plus Apache Spark, to run in parallel as a single unit.

Researchers from MIT CSAIL, the home of envelope-pushing big data acceleration projects like Milk and Tapir, have paired with the Stanford InfoLab to create a possible solution. Written in the Rust language, Weld … Read the rest

Machine learning couldn’t be hotter. A type of artificial intelligence that enables computers to learn to perform tasks and make predictions without explicit programming, machine learning has caught fire among the hip tech set but remains a somewhat futuristic concept for most enterprises. Thanks to technological advances and emerging frameworks, however, machine learning may soon hit the mainstream.

Consulting firm Deloitte expects to see a big increase in the use and adoption of machine learning in the coming year. This … Read the rest

MongoDB 3.4 continues the trend of databases building out support for a range of conceptual data models over the same underlying data store. This multimodel approach aims to deliver a single database that can be used to store data as documents, tables, and graphs simultaneously. The benefit to the user is a dramatically simplified infrastructure when compared to a polyglot persistence model, which might entail managing three or four separate data stores to satisfy those different use cases.

MongoDB 3.4 … Read the rest

Imagine if the files, processes, and events in your entire network of Windows, macOS, and Linux endpoints were recorded in a database in real time. Finding malicious processes, software vulnerabilities, and other evil artifacts would be as easy as asking the database. That’s the power of OSquery, a Facebook open source project that makes sifting through system and process information to uncover security issues as simple as writing a SQL query.
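To make "asking the database" concrete, here is a small stand-in that uses Python's built-in `sqlite3` module rather than OSquery itself. The `processes` table, its columns (`pid`, `name`, `path`, `on_disk`), and the sample rows are assumptions made for illustration, loosely modeled on the kind of process table such a tool exposes. The query looks for running processes whose binary is no longer on disk, a common sign of something suspicious.

```python
import sqlite3

# In-memory stand-in for an endpoint's process table (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE processes (pid INTEGER, name TEXT, path TEXT, on_disk INTEGER)"
)
conn.executemany(
    "INSERT INTO processes VALUES (?, ?, ?, ?)",
    [
        (101, "sshd", "/usr/sbin/sshd", 1),
        (202, "updater", "/tmp/.updater", 0),  # binary deleted while still running
        (303, "bash", "/bin/bash", 1),
    ],
)

# Ask the database: which running processes have no binary left on disk?
suspicious = conn.execute(
    "SELECT pid, name, path FROM processes WHERE on_disk = 0"
).fetchall()

for pid, name, path in suspicious:
    print(pid, name, path)
```

On a real endpoint the table would be populated by the monitoring tool rather than by hand; only the shape of the query carries over.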

Facebook ported OSquery to Windows in 2016, finally … Read the rest

Artificial intelligence is affecting everything from automobiles to health care to home automation and even sports. It’s also going to have a measurable impact on software development, with developers becoming more like data scientists, an AI official with Nvidia believes.

AI and deep learning will mean changes in how software is written, said Jim McHugh, vice president and general manager for Nvidia’s DGX-1 supercomputer, which is used in deep learning and accelerated analytics. The long-standing paradigm of developers spending months … Read the rest

It was only a matter of time until ransomware groups that wiped data from thousands of MongoDB databases and Elasticsearch clusters started targeting other data storage technologies. Researchers are now observing similar destructive attacks hitting openly accessible Hadoop and CouchDB deployments.

Security researchers Victor Gevers and Niall Merrigan, who have monitored the MongoDB and Elasticsearch attacks so far, have also started keeping track of the new Hadoop and CouchDB victims. The two have put together spreadsheets on Google Docs where they … Read the rest