Set up a local development environment for PySpark on Windows

Install Java (8+)
https://www.java.com/en/download/help/download_options.xml#windows
https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Add to Environment Variables
JAVA_HOME C:\Program Files\Java\jdk1.8.0_191
Add to Path C:\Program Files\Java\jdk1.8.0_191\bin
C:\Program Files\Java\jre1.8.0_191\bin

Set up Hadoop winutils:
Download from this GitHub repo https://github.com/steveloughran/winutils
Copy the bin folder under Hadoop 2.7.1 to a location of your choice, for example C:\ProgramData\winutils
Add to Environment Variables
HADOOP_HOME C:\ProgramData\winutils
Add to Path %HADOOP_HOME%\bin

Install Anaconda (5.3)
https://www.anaconda.com/download/
Downgrade Python to 3.6.5, since Python 3.7 may not be compatible with some packages

Install Spark (2.3.2 recommended) with Hadoop 2.7
Download .tgz from https://www.apache.org/dyn/closer.lua/spark/spark-2.3.2/spark-2.3.2-bin-hadoop2.7.tgz
You might need to install 7-Zip to unzip the .tgz. Move the unzipped folder to a separate location, for example C:\
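As an alternative to a separate unzip tool, Python's standard tarfile module can unpack a .tgz archive. This is a minimal sketch, not part of the original steps; the helper name extract_tgz and the download path in the usage comment are hypothetical.

```python
import tarfile
from pathlib import Path


def extract_tgz(archive_path, dest_dir):
    """Unpack a gzip-compressed tar archive into dest_dir.

    Returns the sorted top-level names found in the archive.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(str(archive_path), "r:gz") as archive:
        members = archive.getmembers()
        # Only extract archives from a source you trust, such as an Apache mirror.
        archive.extractall(str(dest))
    return sorted({member.name.split("/")[0] for member in members})


# Usage (hypothetical download location):
# extract_tgz(r"C:\Users\me\Downloads\spark-2.3.2-bin-hadoop2.7.tgz", "C:\\")
```

The returned names make it easy to see which folder was created, which is the folder SPARK_HOME should point to.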
Add to Environment Variables
SPARK_HOME C:\spark-2.3.2-bin-hadoop2.7
Add to Path %SPARK_HOME%\bin
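The three variables set so far (JAVA_HOME, HADOOP_HOME, SPARK_HOME) can be checked with a few lines of Python. This is a rough sketch under the assumption that winutils.exe sits in the bin folder copied earlier; the function name check_environment is hypothetical.

```python
import os
from pathlib import Path

REQUIRED_VARS = ("JAVA_HOME", "HADOOP_HOME", "SPARK_HOME")


def check_environment(env=None):
    """Return a list of problems found; an empty list means every check passed."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not Path(value).is_dir():
            problems.append(f"{name} points to a missing folder: {value}")
    # Assumption: winutils.exe lives in the bin folder under HADOOP_HOME.
    hadoop_home = env.get("HADOOP_HOME")
    if hadoop_home and Path(hadoop_home).is_dir():
        if not (Path(hadoop_home) / "bin" / "winutils.exe").is_file():
            problems.append("winutils.exe not found in the bin folder of HADOOP_HOME")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment variables look OK"]:
        print(line)
```

Run it from a newly opened Command Prompt, since windows that were already open do not see changed variables.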
To verify the Spark installation, open Command Prompt, cd to the bin folder of SPARK_HOME, and type pyspark

Install PyCharm
https://www.jetbrains.com/pycharm/download/#section=windows
Set up the interpreter in PyCharm using a Conda environment, and add packages by clicking "+"
For command-line runs, install dependencies with pip install -r requirements.txt
Use pip freeze > requirements.txt when updating dependencies
Or, in PyCharm, open the Settings/Preferences dialog (Ctrl+Alt+S) and select Tools | Python Integrated Tools.
In the Package requirements file field, type the name of the requirements file or click the browse button and locate the desired file.
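Once the interpreter is configured, a short script run from PyCharm can confirm that Python reaches Spark. The sketch below is an illustration only: it assumes SPARK_HOME is set as described above and that the Spark download keeps its Python sources in the python folder with a py4j-*-src.zip under python\lib. The function names pyspark_paths and run_smoke_test are hypothetical.

```python
import glob
import os
import sys


def pyspark_paths(spark_home):
    """Return the folder and zip files in a Spark download that hold PySpark and py4j."""
    python_dir = os.path.join(spark_home, "python")
    # Assumption: py4j ships as a source zip under python/lib in the Spark download.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def run_smoke_test(spark_home=None):
    """Start a local Spark session, build a small DataFrame, and return its row count."""
    spark_home = spark_home or os.environ["SPARK_HOME"]
    for path in pyspark_paths(spark_home):
        if path not in sys.path:
            sys.path.insert(0, path)
    # Imported here, after sys.path has been extended.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("setup-smoke-test")
        .getOrCreate()
    )
    try:
        df = spark.createDataFrame([(1, "spark"), (2, "hadoop")], ["id", "name"])
        return df.count()
    finally:
        spark.stop()


# Usage: call run_smoke_test() from a script or the PyCharm Python console.
```

If the call raises an error instead of returning a row count, the message usually points to the variable or folder that needs another look.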