python - Deep Learning Udacity course: Assignment 1, Problem 2 (notMNIST)

Published: 2020-12-16 23:01:56 | Category: Python | Source: compiled from the web


After reading this and following the course, I am struggling with the second problem in Assignment 1 (notMNIST):

Let’s verify that the data still looks good. Displaying a sample of the labels and images from the ndarray. Hint: you can use matplotlib.pyplot.

This is what I tried:

import random
rand_smpl = [ train_datasets[i] for i in sorted(random.sample(xrange(len(train_datasets)),1)) ]
print(rand_smpl)
filename = rand_smpl[0]
import pickle
loaded_pickle = pickle.load( open( filename,"r" ) )
image_size = 28  # Pixel width and height.
import numpy as np
dataset = np.ndarray(shape=(len(loaded_pickle),image_size,image_size),dtype=np.float32)
import matplotlib.pyplot as plt

plt.plot(dataset[2])
plt.ylabel('some numbers')
plt.show()

But this is what I got:

It doesn't make much sense. To be honest, my code may not either, since I'm not sure how to approach this problem!

The pickles are created like this:

image_size = 28  # Pixel width and height.
pixel_depth = 255.0  # Number of levels per pixel.

def load_letter(folder,min_num_images):
  """Load the data for a single letter label."""
  image_files = os.listdir(folder)
  dataset = np.ndarray(shape=(len(image_files), image_size, image_size),
                       dtype=np.float32)
  print(folder)
  num_images = 0
  for image in image_files:
    image_file = os.path.join(folder,image)
    try:
      image_data = (ndimage.imread(image_file).astype(float) - 
                    pixel_depth / 2) / pixel_depth
      if image_data.shape != (image_size,image_size):
        raise Exception('Unexpected image shape: %s' % str(image_data.shape))
      dataset[num_images,:,:] = image_data
      num_images = num_images + 1
    except IOError as e:
      print('Could not read:', image_file, ':', e, '- it\'s ok, skipping.')

  dataset = dataset[0:num_images,:]
  if num_images < min_num_images:
    raise Exception('Many fewer images than expected: %d < %d' %
                    (num_images,min_num_images))

  print('Full dataset tensor:',dataset.shape)
  print('Mean:',np.mean(dataset))
  print('Standard deviation:',np.std(dataset))
  return dataset

This function is called as follows:

dataset = load_letter(folder, min_num_images_per_class)
try:
  with open(set_filename, 'wb') as f:
    pickle.dump(dataset, f, pickle.HIGHEST_PROTOCOL)
except Exception as e:
  print('Unable to save data to', set_filename, ':', e)
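
Since the array is written in binary mode ('wb'), it also has to be read back in binary mode ('rb'); the attempt in the question opens the file with "r". Below is a minimal round-trip sketch. The small random array and the temporary file name are stand-ins for illustration, not the course data.

```python
import os
import pickle
import tempfile

import numpy as np

# Synthetic stand-in for one letter's images: 5 images of 28x28 floats.
fake_letter = np.random.rand(5, 28, 28).astype(np.float32)

# Hypothetical file name inside a temporary directory.
set_filename = os.path.join(tempfile.mkdtemp(), "A.pickle")

# Write in binary mode, as the course code does.
with open(set_filename, "wb") as f:
    pickle.dump(fake_letter, f, pickle.HIGHEST_PROTOCOL)

# Read back in binary mode; text mode ("r") fails on Python 3.
with open(set_filename, "rb") as f:
    restored = pickle.load(f)

print(type(restored), restored.shape)
```

What comes back is the same kind of object that was dumped: here an ndarray, not a dictionary.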

The idea here is:

Now let’s load the data in a more manageable format. Since, depending on your computer setup you might not be able to fit it all in memory, we’ll load each class into a separate dataset, store them on disk and curate them independently. Later we’ll merge them into a single dataset of manageable size.

We’ll convert the entire dataset into a 3D array (image index, x, y) of floating point values, normalized to have approximately zero mean and standard deviation ~0.5 to make training easier down the road.
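
The normalization in load_letter is (pixel - pixel_depth / 2) / pixel_depth, which maps 8-bit values in [0, 255] to [-0.5, 0.5]. A small sketch on made-up pixel values (not course data) shows the endpoints and the midpoint:

```python
import numpy as np

pixel_depth = 255.0  # Number of levels per pixel, as in the course code.

# Made-up pixel values covering the two extremes and the midpoint.
raw = np.array([0.0, 127.5, 255.0])

# Same formula as in load_letter: center on zero, then scale by the depth.
normalized = (raw - pixel_depth / 2) / pixel_depth
print(normalized)  # values -0.5, 0.0 and 0.5
```

The actual mean and standard deviation of a real dataset depend on the images, which is why load_letter prints both.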

Solution

This can be done as follows:

# Define a function to convert a numeric label to its letter
def letter(i):
    return 'abcdefghij'[i]


# In a Jupyter notebook, %matplotlib inline is needed to show images in the output
%matplotlib inline
# Pick a random index in the range 0 to len(train_dataset) - 1
sample_idx = np.random.randint(0, len(train_dataset))
# Show the image, with its letter as the title
plt.imshow(train_dataset[sample_idx])
plt.title("Char " + letter(train_labels[sample_idx]))
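
The task asks for a sample of labels and images, so it can help to show several at once. The following is a sketch rather than part of the original answer: show_samples is a hypothetical helper, and the random arrays stand in for train_dataset and train_labels.

```python
import numpy as np
import matplotlib.pyplot as plt


def letter(i):
    """Convert a numeric label (0-9) to its letter."""
    return 'abcdefghij'[i]


def show_samples(dataset, labels, n=8, seed=None):
    """Plot n randomly chosen images in one row, titled with their letters."""
    rng = np.random.default_rng(seed)
    indices = rng.choice(len(dataset), size=n, replace=False)
    fig, axes = plt.subplots(1, n, figsize=(n * 1.5, 2), squeeze=False)
    for ax, idx in zip(axes[0], indices):
        ax.imshow(dataset[idx], cmap='gray')
        ax.set_title(letter(labels[idx]))
        ax.axis('off')
    return fig


# Stand-in data: 100 random 28x28 "images" with labels 0-9.
fake_dataset = np.random.rand(100, 28, 28).astype(np.float32)
fake_labels = np.random.randint(0, 10, size=100)
fig = show_samples(fake_dataset, fake_labels, n=8, seed=0)
```

In a notebook with %matplotlib inline the figure renders when the cell finishes; in a plain script, call plt.show() afterwards.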

Your code actually changed the type of the dataset; it is not an array of size (220000, 28, 28).
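
One way to see the problem in the question's code: np.ndarray(shape=...) allocates a brand-new, uninitialized array, and nothing from loaded_pickle is ever copied into it, so plotting it shows arbitrary memory contents. A small sketch, with a synthetic array standing in for the loaded data:

```python
import numpy as np

# Synthetic stand-in for what pickle.load returned: 3 images of 28x28.
loaded = np.ones((3, 28, 28), dtype=np.float32)

# This only uses len(loaded) for the shape; the contents are NOT copied.
fresh = np.ndarray(shape=(len(loaded), 28, 28), dtype=np.float32)

print(fresh.shape == loaded.shape)      # same shape ...
print(np.shares_memory(fresh, loaded))  # ... but unrelated memory

# To work with the loaded images, use the loaded object itself.
dataset = loaded
print(dataset[2].mean())
```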

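In general, a pickle is a file that stores some objects, not the array itself. You should use the objects from the pickle directly to get your training dataset (using the names from your code snippet):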

# This gives you train_dataset and train_labels
train_dataset = loaded_pickle['train_dataset']
train_labels = loaded_pickle['train_labels']
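
This indexing assumes the pickle holds a dictionary with those keys. The per-letter pickles in the question are written with pickle.dump(dataset, ...), so they hold a bare array, and indexing one with a string key would fail. Below is a hedged sketch that handles both cases; get_images is a hypothetical helper and the data is synthetic.

```python
import pickle

import numpy as np


def get_images(obj, key='train_dataset'):
    """Return the image array from either a dict pickle or a bare-array pickle."""
    if isinstance(obj, dict):
        return obj[key]
    return obj


# Synthetic stand-ins, round-tripped through pickle in memory.
images = np.zeros((4, 28, 28), dtype=np.float32)
as_dict = pickle.loads(pickle.dumps({'train_dataset': images,
                                     'train_labels': np.arange(4)}))
as_array = pickle.loads(pickle.dumps(images))

print(get_images(as_dict).shape)
print(get_images(as_array).shape)
```

Checking type(loaded_pickle) right after loading is a quick way to find out which case applies.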

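Update:

At @gsarmas's request, the link to my complete Assignment 1 solution is here.

The code is commented and mostly self-explanatory, but if you have any questions, feel free to contact me by any means on GitHub.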

(Editor: 李大同)
