
The Data Science, Big Data, Data Analytics, Artificial Intelligence and Machine Learning Hype

Not only in Gartner’s Hype Cycle for Emerging Technologies but in nearly every blog and newsletter, Data Science, Data Analytics, Big Data, Artificial Intelligence (AI) and Advanced Machine Learning (ML) have been the number one topics for several months. The hype around these technologies is at its peak. Smart Factory (Industry 4.0) contributes to this as well, because one of its four pillars is Data Analytics and Big Data.

But how do all these topics relate to each other?
The basis for all of them is data, which is first created and captured from various sources (sensors on machines, user behavior on websites, applications, computers and many more), then archived, and finally analyzed to answer specific questions, find patterns or reveal unusual constellations.

Data will be a company’s golden asset in the future, so it is very important to capture and archive it now. It is worthless to tell everybody that we could have all the data, e.g. for transactions, customer behavior, machine processes and application logs, if we do not activate or install the necessary sensors and do not store and archive that data. Only when as much data as possible is saved from the start to the end of a process, including the data of the final result, can a Data Scientist use it to try to answer questions that could not be answered otherwise. This leads EMC to the prediction that “the amount of stored data is growing faster than ever before and experts states that by the year 2020, about 1.7 megabytes of new information will be created every second for every human being on the planet” [1].
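
To put the quoted figure into perspective, the following minimal sketch converts 1.7 megabytes per second per person into daily volumes. The world population of 7.5 billion and the decimal units (1 GB = 1,000 MB) are assumptions made for illustration; they are not taken from the EMC study.

```python
# Back-of-the-envelope conversion of the quoted rate into daily volumes.
MB_PER_SECOND_PER_PERSON = 1.7   # figure quoted in the text
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds
ASSUMED_POPULATION = 7.5e9       # assumption for illustration, not from the study


def daily_gb_per_person(mb_per_second: float = MB_PER_SECOND_PER_PERSON) -> float:
    """Return the data volume per person per day in gigabytes (1 GB = 1,000 MB)."""
    return mb_per_second * SECONDS_PER_DAY / 1_000


def daily_exabytes_worldwide(population: float = ASSUMED_POPULATION) -> float:
    """Return the worldwide data volume per day in exabytes (1 EB = 1e9 GB)."""
    return daily_gb_per_person() * population / 1e9


if __name__ == "__main__":
    print(f"{daily_gb_per_person():.1f} GB per person per day")
    print(f"{daily_exabytes_worldwide():.0f} EB worldwide per day")
```

Under these assumptions the rate corresponds to roughly 147 GB per person per day, which shows why storing and archiving has to be planned early.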

But what is the difference between Data Science, Big Data, Data Analytics, Artificial Intelligence and Machine Learning?
The recent boom around these topics has also created a lot of confusion about the terms. First of all: there is no clear definition. Companies and universities define the terms differently, but most describe Data Science as the umbrella over Data Analytics, Big Data, Artificial Intelligence and Machine Learning. Most also use the terms Data Analytics and Data Analysis synonymously.

Big Data refers to large and complex data sets (volume and variety) that are much larger than traditional data sets and are processed at a higher speed (velocity). Volume, variety and velocity (the 3Vs) are the three defining dimensions of Big Data. For more information about traditional data sets, you might also have a look at “Do we still need an Enterprise Data Warehouse?”.

When we think about the traditional 3Vs, explained above and widely accepted in the industry as a definition, we recognize that enterprises have been handling all three for more than a decade without problems. So there must be another definition of Big Data.

I will stay with three Vs, but shift the focus to the value we generate for the business by analyzing the data. That is the difference to simply dealing with volume, variety and velocity. So I think we are better served with ‘business value’ as the first ‘V’. Beyond that, it is important to successfully combine your analytic capabilities, your source data and your business needs. Our second ‘V’ should therefore be the vision of what is required to achieve that. The complexity of every very large enterprise today calls for our new third ‘V’, virtualization, to simplify and accelerate the efforts behind the first two.

I will explain the remaining three terms in separate posts, because otherwise this post would get too voluminous. So stay tuned for the next post.

Bibliography

  1. EMC: IDC Digital Universe Study: Big Data, Bigger Digital Shadows and Biggest Growth in the Far East 2011.
    Retrieved: 14.06.2017.

Original Post: https://www.redtoo.com/ch/blog/the-data-science-big-data-data-analytics-artificial-intelligence-and-machine-learning-hype/

Do We Still Need a Data Warehouse?

Do we still need an Enterprise Data Warehouse?

While studying for a Microsoft data warehouse exam, I asked myself whether a traditional enterprise data warehouse is still needed today and whether the time I am spending on my studies is worth it. There is no question that data has become more and more important and is nowadays a strategic asset that companies use to transform their businesses and uncover new insights. But does a traditional data warehouse still fit into that picture?

A data warehouse categorized as “traditional”, which is what my studies are about, is meant to be a central repository for all historical information in a company, on the assumption that data is captured now but analyzed later. To this end, data from transactional systems such as ERP, CRM and LOB applications is extracted, transformed and loaded (ETL): normally it lands first in a staging area, is then cleansed and enriched, and is finally transferred into the tables of a relational schema in the data warehouse. The resulting data warehouse becomes the main source of information, a central version of the truth, for report generation, analysis and presentation through ad hoc reports, portals and dashboards.
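
The flow described above (extract, staging area, cleansing, relational schema) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python’s built-in sqlite3 module instead of SQL Server, and the table names staging_sales and fact_sales as well as the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical rows as they might arrive from a transactional system (ERP, CRM).
SOURCE_ROWS = [
    ("1001", " Alice ", "2017-06-01", "120.50"),
    ("1002", "Bob", "2017-06-01", "80"),
    ("1003", "", "2017-06-02", "not-a-number"),  # dirty row, rejected during cleansing
]


def is_number(text):
    """Return True if text can be parsed as a floating point number."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def run_etl(rows):
    """Load rows into a staging table, cleanse them, and fill a typed warehouse table."""
    conn = sqlite3.connect(":memory:")

    # Staging area: everything is stored as text, exactly as extracted.
    conn.execute(
        "CREATE TABLE staging_sales "
        "(order_id TEXT, customer TEXT, order_date TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO staging_sales VALUES (?, ?, ?, ?)", rows)

    # Warehouse table: relational schema with proper types and constraints.
    conn.execute(
        "CREATE TABLE fact_sales (order_id INTEGER PRIMARY KEY, "
        "customer TEXT NOT NULL, order_date TEXT NOT NULL, amount REAL NOT NULL)"
    )

    # Transform: trim names, convert types, drop rows that fail validation.
    cleansed = [
        (int(order_id), customer.strip(), order_date, float(amount))
        for order_id, customer, order_date, amount
        in conn.execute("SELECT * FROM staging_sales")
        if customer.strip() and is_number(amount)
    ]
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)", cleansed)
    conn.commit()
    return conn
```

Calling run_etl(SOURCE_ROWS) keeps all three rows in the staging table but loads only the two valid ones into fact_sales, which is the kind of curated table that reports and dashboards would then query.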

What insiders have recognized is that the data warehouse described above is undergoing a transformation. Virtualization and the move of resources to the cloud are one reason. Another is that organizations try to incorporate insights from data that does not fit the traditional relational database model, and that the velocity at which this data is captured, processed and used is increasing. Companies now use real-time data to change, build or optimize their businesses as well as to sell, transact and engage in dynamic, event-driven processes like market trading. The traditional data warehouse simply was not architected to support near real-time transactions or event processing, which results in decreased performance and slower time-to-value.

A modern data warehouse has to support workloads on relational and non-relational data, whether the data resides on-premises or in the cloud and whether the workloads use on-premises solutions or cloud solutions and services. The so-called “Logical Data Warehouse” (LDW) or “Modern Data Warehouse” uses repositories, virtualization and distributed processes in combination. Instead of working through the requirements-based model of the traditional data warehouse, where the schema and the data to be collected are defined up front, advanced analytics and data science take an experimental approach, exploring answers to ill-formed or nonexistent questions. This requires examining the data before it is curated into a schema, so that the data itself can drive insight.
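
As a rough illustration of examining data before it is curated into a schema, the sketch below profiles which fields occur in raw, schema-less records. The sample events and the function name profile_fields are hypothetical; a real exploration would run against far larger and messier data.

```python
import json
from collections import Counter

# Hypothetical raw events, stored as they arrived (no schema enforced on write).
RAW_EVENTS = [
    '{"device": "press-01", "temperature": 71.5}',
    '{"device": "press-02", "temperature": 69.0, "vibration": 0.12}',
    '{"user": "u-17", "page": "/pricing"}',
]


def profile_fields(raw_lines):
    """Count how often each field occurs, to explore the data before fixing a schema."""
    counts = Counter()
    for line in raw_lines:
        counts.update(json.loads(line).keys())
    return counts


if __name__ == "__main__":
    for field, count in profile_fields(RAW_EVENTS).most_common():
        print(f"{field}: {count}")
```

A profile like this can suggest which fields are stable enough to become columns in a relational schema and which are better left in their raw form for further exploration.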

So the recommendation, and the answer to the opening question, is that companies should use both approaches and that established data warehouse teams should collaborate with this new breed of data scientists as part of a move towards the logical or modern data warehouse.

Original Post: https://www.redtoo.com/blog/do-we-still-need-an-enterprise-data-warehouse/

Microsoft Azure SQL Database

Use Microsoft Azure for Ad-Hoc Testing

Microsoft Azure provides a rich set of features that can be set up very easily and quickly. Therefore I recommend it for ad-hoc tests and for quickly trying things out. In this post I will show how to use Microsoft Azure SQL Database to quickly test some Transact-SQL statements.

All interaction with a relational database is done in SQL (Structured Query Language). SQL is a standard of both the International Organization for Standardization (ISO) and the American National Standards Institute (ANSI). Microsoft’s dialect of the SQL standard, which is used to interact with Microsoft’s SQL Server and Microsoft Azure SQL Database, is called Transact-SQL (T-SQL).

T-SQL is the main language used to manage and manipulate data in Microsoft’s main relational database management system, SQL Server, whether on-premises or in the cloud (Microsoft Azure SQL Database).

If you don’t have a Microsoft Azure subscription yet, you can make use of a 250 CHF voucher for a Microsoft Azure business subscription. Have a look at Trial Offer for Microsoft Azure for more information.

Create an Azure SQL Database

Now that you hopefully have an Azure subscription, you can create an Azure SQL Database instance to use for this post.

    1. Browse to http://portal.azure.com. If you are prompted to sign in, do so with the Microsoft account that is associated with your Azure subscription.
    2. At the bottom of the Hub menu (the vertical bar on the left), click New (represented by a + symbol if the menu is minimized), and then in the New blade that appears, click Databases, and then click SQL Database.
Create Azure SQL Database

  1. In the SQL Database blade:
      1. Enter the name AdventureWorksLT
      2. In the Subscription box, ensure that your subscription is listed.
      3. In the Resource group section, ensure that New is selected, and enter TSQL_Quick_Try as the new resource group name.
      4. In the Select Source list, select Sample.
      5. In the Select sample section, ensure that AdventureWorksLT[V12] is selected.
      6. Click Server, then click Create a new server, enter the following details and click OK.
        • A unique Server name for your server (a red exclamation mark will be displayed if the name you have entered is invalid or already in use, otherwise a green tick is shown).
        • A user name you want to assign to the Server admin login. This can be your
          name or some other name you’ll remember easily – however, you cannot use
          “Administrator”.
        • A Password for your server administrator account. This must meet the password
          complexity rules for Azure SQL Database, so for example it cannot be blank or
          “password”.
        • The Location where your server should be hosted. Choose the location nearest
          to you.
        • Leave the option to allow Azure services to access the server selected (this
          opens an internal firewall port in the Azure datacenter to allow other Azure
          services to use the database).

        New SQL Server

      7. In the Pricing Tier section, select Basic.
      8. Ensure that your selections are similar to those below, and click Create.

    SQL Server Pricing Tier

  2. After a short time, your SQL Database will be created, and a notification is displayed on the
    dashboard. To view the blade for the database, click Resource groups and then click the TSQL_Quick_Try resource group.

    TSQL_Quick_Try Resource Group Essentials Blade

Configure Firewall Rules for your Azure SQL Database Server

  1. In the TSQL_Quick_Try blade, under Essentials, click the server name for your database
    server (which should be in the format server_name.database.windows.net). In my case that is tsqlquicktry042.database.windows.net

    Azure SQL Server Show Firewall Settings

  2. In the blade for your SQL server, under Essentials, click Show firewall settings.
  3. In the Firewall settings blade, click the Add client IP icon to create a firewall rule for your client
    computer, and then click Save.

    Azure SQL Server Firewall Add Client IP

Note: Azure SQL Database uses firewall rules to control access to
your database. If your computer’s public-facing IP address
changes (or you want to use a different computer), you’ll need
to repeat this step to allow access. Alternatively, you can modify
the firewall settings for your Azure SQL Database server
to allow a range of IP addresses – see the Azure SQL Database
documentation for details of how to do this.
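
Conceptually, each firewall rule is a named range with a start and an end IP address, and a client is admitted if its public address falls into one of the ranges. The following minimal sketch illustrates that check with Python’s standard ipaddress module. It does not show how Azure implements the firewall; the rule names and the addresses (taken from the reserved documentation ranges) are hypothetical.

```python
from ipaddress import IPv4Address

# Hypothetical rules as (name, start IP, end IP); the addresses are documentation examples.
RULES = [
    ("ClientIP", "203.0.113.25", "203.0.113.25"),       # a single client address
    ("OfficeRange", "198.51.100.0", "198.51.100.255"),  # a range of addresses
]


def is_allowed(client_ip, rules=RULES):
    """Return True if client_ip falls inside at least one start/end range."""
    ip = IPv4Address(client_ip)
    return any(
        IPv4Address(start) <= ip <= IPv4Address(end)
        for _name, start, end in rules
    )


if __name__ == "__main__":
    for candidate in ("203.0.113.25", "198.51.100.77", "203.0.113.26"):
        print(candidate, "allowed" if is_allowed(candidate) else "blocked")
```

The single-address rule mirrors what the Add client IP button creates, which is why the step has to be repeated once the public address of your computer changes.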

Installing and Connecting from a Client Tool

SQL Server Management Studio is the primary management tool for Microsoft SQL Server, and you can also use it to manage and query Azure SQL Database. If you do not already have SQL Server Management Studio installed, you can download it from the Download SQL Server Management Studio (16.5) page. When the download is complete, run the executable file to install SQL Server Management Studio.

After installing SQL Server Management Studio, you can start it and connect to your Azure SQL Database server by selecting the option to use SQL Server authentication, specifying the fully-qualified name of your Azure SQL Database server (<your_server_name>.database.windows.net), and entering your user name in the format <your_user_name>@<your_server_name> and password, as shown here:

Connect to Azure SQL Database
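
The two name formats used in the connection dialog are easy to mix up, so here is a small sketch that derives both from the short server name. The function name azure_sql_login and the user name sqladmin are hypothetical; the server name is the one used earlier in this post.

```python
def azure_sql_login(server_name, user_name):
    """Return (fully qualified server name, login name) in the formats described above."""
    fully_qualified_name = f"{server_name}.database.windows.net"
    login_name = f"{user_name}@{server_name}"
    return fully_qualified_name, login_name


if __name__ == "__main__":
    server, login = azure_sql_login("tsqlquicktry042", "sqladmin")
    print("Server name:", server)
    print("Login:", login)
```

Note that the login uses the short server name after the @ sign, not the fully qualified one.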

After connecting, you can create a new query and run it by clicking Execute, and you can save and open Transact-SQL scripts. Be sure to select the AdventureWorksLT database when running your queries as shown here:

Run Query in MS SQL Server Management Studio

Here is the T-SQL statement I tried. You can copy it and run it against your Azure SQL Database:

Original Post: https://www.redtoo.com/ch/blog/use-microsoft-azure-for-ad-hoc-testing/

A cloud for everyone on every device.

Microsoft unveils Azure tweaks before partner conference

SAN FRANCISCO – Microsoft hopes to steer attention away from this week’s layoff news with the kickoff of its Worldwide Partner Conference July 12-16 in Orlando.

While CEO Satya Nadella delivers the keynote July 13, he may have to briefly address the topic of 7,800 Nokia employees who will be let go as the software giant continues to untangle itself from that acquisition. But the focus will be on urging the company’s army of global resellers to guide customers toward its growing cloud business, Azure.

To that end, Microsoft announced Friday that its Power BI business analytics service would be coming out of beta on July 24. The service promises to allow improved, streamlined access to cloud-based data.

“We believe Power BI is, by a very wide margin, the most powerful business analytics SaaS service,” wrote James Phillips, vice president of Microsoft’s Business Intelligence Products Group, in a blog post. “And yet even the most non-technical of business users can sign up in five seconds, and gain insights from their business data in less than five minutes with no assistance, from anyone.”

The company promises to unveil other Azure improvements during the conference, which also will feature talks by COO Kevin Turner as well as inspirational speeches by the likes of Tommy Caldwell and Kevin Jorgeson. The two mountain climbers recently completed a daunting ascent of one of El Capitan’s toughest routes – the Dawn Wall – this past January in Yosemite National Park.

Original News: http://www.usatoday.com/story/tech/2015/07/10/microsoft-unveils-azure-tweaks-before-partner-conference/29999107/