Description
Discussion 1:
This week we will concentrate on creating visualizations (write) and consuming visualizations (read) (Kirk, 2016). We must first learn how to understand a visualization as a viewer and at the same time use our skills, knowledge, and mindsets to achieve excellence in data visualization design. We will focus on The Seven Hats of Data Visualization. Select one of the hats and expand on it for your classmates.
Please provide 2 replies on the discussion posts.
Discussion 2:
Discussion 2 (Chapter 5): What is the relationship between Naïve Bayes and Bayesian networks? What is the process of developing a Bayesian networks model?
Please provide 2 replies on the discussion posts. Responses should be 250-300 words. Respond to two postings provided by your classmates.
Discussion 3:
Discussion 3 (Chapter 6): List and briefly describe the nine-step process in conducting a neural network project.
Post 1:
1)
Data visualization has grown rapidly in recent years into one of the most significant areas of data analytics. Representing data variables graphically is an excellent way to interpret and analyze data, revealing hidden details and patterns. Data rendered graphically with varying sizes and other visual parameters can appeal to readers and heighten impact. With data visualization methods, the key information hidden within the data can be communicated effectively and without ambiguity.
User Hat – It can be argued that this is the most important hat a data visualization specialist wears (in my opinion, of course). The visualization professional must ultimately serve the users who rely on the product daily. Everyone needs to understand users' profiles, personalities, needs and pain points, and behaviors to ensure that users achieve the benefits they want, or more.
Interpreting the statistical data with the required accuracy, and choosing the style for establishing relationships, are the initiator's responsibility. The initiator plays a significant role when the collected data must be identified and prepared for the subsequent data mining stages. The motivation needed to establish the interconnections and the data styles is supplied in this first stage of data analytics.
Wearing the user hat means stepping back, looking at the visualizations you create, and evaluating them as a user would. So, when building a prototype, it is essential to stay in close contact with users through workshops or other means. It is important to understand user perspectives and to be able to propose and evaluate features accordingly. We have seen many times that people get trapped in their own technology choices and view users' needs only through that lens. Technical viability matters, of course, but that test belongs, in my opinion, to a later stage.
Another important thing is to gather users' opinions on the business results. Your direct role may not guarantee this, but at least you can get to know the product you have developed, both from a usage point of view and from a technical point of view. After all, how good would a product be if it merely integrated with everything else out there?
Reference:
Kirk, A. (2012). Data Visualization: A Successful Design Process. Packt Publishing Ltd.
Kirk provides an in-depth analysis of the eight hats of visualization design. Understanding these hats is important in developing an effective approach that firms can employ for effectiveness and efficiency.
2)
The seven hats of data visualization break down the different capabilities a visualizer needs. Below are the seven hats:
- Project Manager
- Communicator
- Scientist
- Data Analyst
- Journalist
- Designer
- Technologist
Technologist: The technologist is the developer who builds the solution to a problem. They possess knowledge and capability in programming and software design. As technologies evolve, they have the hunger and interest to adapt and learn new ones, which ensures they do not become outdated or fall behind. A technologist also has a good grasp of mathematics. One of the key contributions a technologist can make is automating manual processes, which is enormously helpful as the world enters the automation age. Their work is to keep refining the solution for the better and, if required, to replace it entirely and start from the beginning. Even after the solution is built, it has to be tested before rolling it out to the real world, just to make sure everything works as intended. Once the solution is made public, the technologist needs to keep updating it and provide support for the issues users encounter.
Post 2:
1)
The Naive Bayes model is technically a special case of a Bayesian network. A Bayesian network represents a set of variables as a graph of nodes, modeling dependencies between those nodes as edges. In modeling dependencies, a Bayesian network must make inherent assumptions about dependence and independence between variables. In the real world, two variables are virtually never truly and completely independent. However, to make practical use of a Bayesian network given constraints on the quantity of training data and computational resources available, data scientists simplify these networks by approximating nearly independent variables as completely independent. The Naive Bayes model does the same, only more aggressively: its defining characteristic is the extremely over-simplified assumption that all features are conditionally independent given the class label. This permits us to use Bayes' rule for probability directly. Usually this independence assumption works well for most cases, even when the features are not actually independent.
A Bayesian network does not bake in such an assumption: all the dependencies in a Bayesian network have to be modeled. The network (graph) can be learned by the machine itself, or it can be designed in advance by the developer if they have sufficient knowledge of the dependencies.
A Bayesian network is a graphical model that represents a set of variables and their conditional dependencies.
For example, a disease and its symptoms can be connected in a network diagram. All symptoms connected to the disease are then used to calculate the probability that the disease is present.
The Naive Bayes classifier is a technique for assigning class labels to samples from an available set of labels. This method assumes each feature's value is independent of the others and does not consider any correlation or relationship between the features.
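To make this concrete, here is a minimal Gaussian Naive Bayes sketch in Python. The Gaussian likelihood, variable names, and variance smoothing are my own illustrative assumptions, not a reference implementation; the point is that independence lets us store one mean and variance per (class, feature) pair instead of any joint structure.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature Gaussian parameters for each class.
    Conditional independence means only one mean/variance per
    (class, feature) pair is stored -- no joint structure is modeled."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (len(Xc) / len(X),       # prior P(c)
                     Xc.mean(axis=0),        # per-feature means
                     Xc.var(axis=0) + 1e-9)  # per-feature variances (smoothed)
    return params

def predict_gaussian_nb(params, x):
    """Choose the class maximizing log P(c) + sum_i log P(x_i | c)."""
    scores = {}
    for c, (prior, mu, var) in params.items():
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        scores[c] = np.log(prior) + log_lik
    return max(scores, key=scores.get)
```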
The process of developing the Bayesian network model:
The first step is to decide what the variables of interest are; these become the nodes in the BN. In this example, the abundance of native fish depends directly only on the level of pesticide in the river and the river flow, so Native Fish Abundance, a so-called "leaf node," has only those two parent nodes. River Flow depends on how much rain falls in a given year (Annual Rainfall) and how much of that water ends up in the river, which means it also depends on the long-term Drought Conditions. The amount of pesticide in the river (Pesticide in River) depends on Pesticide Use and on whether there is enough rain (Annual Rainfall) to wash it into the river. Finally, the condition of the trees on the river bank depends only on the long-term drought and more recent rainfall.
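As a sketch of this first step, the structure described above can be written down as a parent map. The node names come from the example itself; the plain-dictionary representation is just one illustrative choice, and a full model would still attach a conditional probability table to each node.

```python
# Each node maps to the list of its parent nodes in the directed graph.
parents = {
    "Annual Rainfall":       [],
    "Drought Conditions":    [],
    "Pesticide Use":         [],
    "River Flow":            ["Annual Rainfall", "Drought Conditions"],
    "Pesticide in River":    ["Pesticide Use", "Annual Rainfall"],
    "Tree Condition":        ["Drought Conditions", "Annual Rainfall"],
    "Native Fish Abundance": ["Pesticide in River", "River Flow"],
}
```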
2)
What is the relationship between Naïve Bayes and Bayesian networks?
Bayesian networks are more complex than Naive Bayes, yet the two often perform similarly well; in fact, Bayesian networks can perform worse than Naive Bayes on datasets with more than fifteen attributes. Recently, Bayesian networks have received more recognition than ever before, establishing them as a new paradigm in artificial intelligence, predictive analytics, and data science (Sharda et al., 2019). Naive Bayes assumes that every attribute is conditionally independent of the others, which allows us to apply Bayes' rule for probability directly. Generally, this independence assumption works well for most cases, even when the attributes are not truly independent. In a Bayesian network, by contrast, all the dependencies must be modeled. A machine can learn the network structure automatically, or a designer with adequate knowledge of the dependencies can specify it in advance. When a Bayesian network is used as a classifier, the structure is chosen based on a scoring function such as the Bayesian scoring function or minimum description length; these scoring functions constrain the structure and parameters using the data. After the structure has been learned, the class is determined only by the nodes in the Markov blanket, and all variables outside the Markov blanket are discarded. In the Naive Bayes network, which is the better known of the two these days, all attributes are treated as features that are independent given the class.
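For intuition about the Markov blanket mentioned above (a node's parents, its children, and the children's other parents), here is a small self-contained helper. This is my own illustrative sketch over a DAG stored as a {child: [parents]} map, not library code.

```python
def markov_blanket(node, parents):
    """Markov blanket of `node`: its parents, its children, and the
    children's other parents, in a DAG given as a {child: [parents]} map."""
    children = [n for n, ps in parents.items() if node in ps]
    blanket = set(parents[node]) | set(children)
    for child in children:
        blanket |= set(parents[child]) - {node}  # co-parents
    return blanket

# Tiny illustrative DAG: A -> C <- B, then C -> D
print(markov_blanket("C", {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}))
# -> {'A', 'B', 'D'}
```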
What is the process of developing a Bayesian networks model?
Careful identification of the model variables is viewed as critical when assessing a threat or offense. Specialists build the model structure based on the variables identified earlier, and then connect the relevant data to those model variables. They then perform model parameterization and use expectation-maximization to handle missing data. Finally, they validate the model's outcomes and propose future improvements.
Post 3:
1)
1. Initialization: Step one. Import NumPy. Seriously.
2. Data Generation
Deep learning is data-hungry. Although many clean datasets are available online, we will generate our own for simplicity: for inputs a and b, the outputs are a+b, a-b, and |a-b|. 10,000 data points are generated.
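A minimal sketch of this step in NumPy; the input range and random seed are my own illustrative choices, not the post's.

```python
import numpy as np

rng = np.random.default_rng(42)
N = 10_000
a = rng.uniform(-10, 10, N)
b = rng.uniform(-10, 10, N)
X = np.column_stack([a, b])                         # inputs,  shape (N, 2)
Y = np.column_stack([a + b, a - b, np.abs(a - b)])  # targets, shape (N, 3)
```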
3. Train-test Splitting
Our dataset is split into a training set (70%) and a testing set (30%). Only the training set is used for tuning the neural network; the testing set is used only for performance evaluation once training is complete.
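Continuing the sketch above (reusing rng, N, X, and Y), a simple shuffle-and-slice split:

```python
idx = rng.permutation(N)        # shuffle indices so the split is random
split = int(0.7 * N)            # 70% train, 30% test
X_train, X_test = X[idx[:split]], X[idx[split:]]
Y_train, Y_test = Y[idx[:split]], Y[idx[split:]]
```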
4. Data Standardization
Data in the training set is standardized so that each feature has zero mean and unit variance. The scalers generated by this procedure are then applied to the testing set.
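Continuing the sketch, the training-set statistics act as the "scalers" that are reused on the test set:

```python
mean, std = X_train.mean(axis=0), X_train.std(axis=0)
X_train = (X_train - mean) / std   # zero-mean, unit-variance features
X_test = (X_test - mean) / std     # same training-set scalers reused
```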
5. Neural Net Construction
We represent a layer with a Python class. Every layer (except the input layer) has a weight matrix W, a bias vector b, and an activation function. Each layer is appended to a list called neural_net; that list is then a representation of the fully connected neural network.
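A minimal version of such a layer class, continuing the sketch; the layer sizes, tanh hidden activation, and linear output layer are illustrative assumptions.

```python
class Layer:
    """A fully connected layer: weight matrix W, bias vector b, activation."""
    def __init__(self, n_in, n_out, activation):
        self.W = rng.normal(0, 0.1, (n_in, n_out))  # small random weights
        self.b = np.zeros(n_out)
        self.activation = activation

# 2 inputs -> 8 hidden units (tanh) -> 3 linear outputs
neural_net = [Layer(2, 8, np.tanh), Layer(8, 3, lambda z: z)]
```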
6. Forward Propagation
We define a function for forward propagation given a certain set of weights and biases. The connections between layers are expressed in matrix form.
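In the sketch, forward propagation is just a loop of matrix multiplications through the neural_net list:

```python
def forward(net, x):
    """Propagate activations layer by layer: a = activation(a @ W + b)."""
    a = x
    for layer in net:
        a = layer.activation(a @ layer.W + layer.b)
    return a
```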
7. Back-propagation
This is the trickiest part, and the one many of us simply do not understand. Once we have defined a loss metric e for evaluating performance, we would like to know how the loss metric changes when we perturb each weight or bias.
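The "perturb each weight" idea can be illustrated literally with a finite-difference estimate. This is my own slow stand-in for true back-propagation, which computes the same sensitivities analytically via the chain rule.

```python
def numerical_grad(loss_fn, param, eps=1e-5):
    """Estimate d(loss)/d(param) by nudging each entry of `param` in place."""
    grad = np.zeros_like(param)
    for i in np.ndindex(param.shape):
        old = param[i]
        param[i] = old + eps
        up = loss_fn()
        param[i] = old - eps
        down = loss_fn()
        param[i] = old                      # restore the original value
        grad[i] = (up - down) / (2 * eps)   # central difference
    return grad
```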
8. Iterative Optimization
We now have every building block for training a neural network.
Once we know the sensitivities of the weights and biases, we iteratively minimize the loss metric by gradient descent, stepping each parameter against its gradient (hence the minus sign in the update rule).
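Putting the earlier sketches together, one gradient-descent step on a mean-squared-error loss looks like this; the learning rate is an illustrative choice, and a real run would repeat this step many times.

```python
loss = lambda: np.mean((forward(neural_net, X_train) - Y_train) ** 2)
lr = 0.01
for layer in neural_net:
    layer.W -= lr * numerical_grad(loss, layer.W)  # minus sign: descend
    layer.b -= lr * numerical_grad(loss, layer.b)
```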
9. Testing
The model generalizes well if the testing loss is not much higher than the training loss. We also make some test cases to see how the model performs.
2)
Neural networks are an advanced data mining tool whose processing elements (neurons) are organized in different ways to form the network structure. There are nine steps for conducting a neural network project. In the first step, data is collected for training and testing the network, and the availability of adequate data is verified. The second step defines the network structure: the collected data is identified, and a plan is made for testing the network's performance. In the next steps, the network architecture and learning method are selected; the choice depends on the available tools and the capabilities of the development personnel. Many problems have been demonstrated successfully at a high rate given certain design considerations, such as the number of neurons and the number of layers, and genetic algorithms can be used to select the network design (Sharda, Delen, & Turban, 2020).
In step 5, the network weights and parameters are initialized and then adjusted as training performance feedback is received; the initial values are important in determining the efficiency and length of training, and some methods change the parameters during training to enhance performance. Step 6 transforms the application data into the type and format required by the neural network; this may require software for data preprocessing, or the operations may be performed within neural network packages, and the data storage and manipulation techniques should be designed to allow efficient retraining of the network. In steps 7 and 8, training and testing are conducted with the input and desired output data: the network computes the outputs and adjusts the weights until the computed outputs are within an acceptable tolerance of the known outputs for the input cases. Finally, in step 9 the trained network is implemented: it is deployed and used with new cases (Sharda, Delen, & Turban, 2020).